PR #15091 llama : add gpt-oss

ggerganov merged 48 commits into master from gpt-oss-mxfp4

oai moe

81991fcd

compat with new checkpoint

917f9233

add attn sink impl

a4ab8693

add rope scaling yarn

3801c364

logits match with latest transformers code

13f39f6b

wip chat template

b3594b30

Merge branch 'master' into xsn/oai_moe

bd571580

rm trailing space

089a7ab4

use ggml_scale_bias

4d01b36b

Merge branch 'master' into xsn/oai_moe

f271cc80

rm redundant is_swa_all

106b17e5

convert interleaved gate_up

e2c1beb3

Merge remote-tracking branch 'gg-public/master' into xsn/oai_moe-gg

4431c823

Merge remote-tracking branch 'gg-public/master' into xsn/oai_moe-gg

fe9b818b

Merge remote-tracking branch 'gg-public/master' into xsn/oai_moe-gg

539c2b63

graph : fix activation function to match reference (#7)

039a6f16

Merge branch 'master' into xsn/oai_moe-gg

aa240b99

Merge branch 'master' into xsn/oai_moe-gg

32a654c2

vocab : handle o200k_harmony special tokens

13f3568c

ggml : add attention sinks support (#1)

e59b2eb1

repack mxfp4 upon conversion

832dc26c

clean up a bit

c68069d1

enable thinking

423b1919

add quick hack to render only some special tokens

4dd479b7

fix bf16 conversion

ebc7da53

remove vocab hack

a543ddfd

webui ok

6b303729

support chat parsing for gpt-oss

44bdb752

Merge branch 'master' into xsn/oai_moe

65b536f9

fix webui

61979176

direct mapping mxfp4, FINALLY

3c4725ba

force using mxfp4

04cfb6d2

properly use lazy tensor

4cf69dff

ggml : add mxfp4

ec95c0e8

ggml : add ggml_add_id (#13)

3ef6c8c1

Merge branch 'master' into xsn/oai_moe

cd514cc3

Merge branch 'xsn/oai_moe' into mxfp4-rebased

98c4be53

ggerganov requested a review from 0cc4m 159 days ago

ggerganov requested a review from JohannesGaessler 159 days ago

ggerganov requested a review from ngxson 159 days ago

ggerganov requested a review from slaren 159 days ago

github-actions added testing

github-actions added Nvidia GPU

github-actions added Vulkan

github-actions added examples

github-actions added python

github-actions added server

github-actions added ggml

github-actions added SYCL

github-actions added Apple Metal

github-actions added Ascend NPU

github-actions added OpenCL

Merge branch 'master' into gpt-oss-mxfp4

fcb23396

llama : fix compile error

98f34448

cuda : add fallback for __nv_cvt_e8m0_to_bf16raw

df8411ed

slaren force pushed to df8411ed 159 days ago

cleanup

60ab08a5

sycl : fix supports_op for MXFP4

256fe66c

fix Unknown reasoning format

cd8ed32b

ggml-cpu : fix AVX build

a3b291e8

fix hip build

1ea3769f

cuda : add mxfp4 dequantization support for cuBLAS

07d781e4

slaren force pushed to 07d781e4 159 days ago

ggml-cpu : fix mxfp4 fallback definitions for some architectures

b236c90f

cuda : fix version required for __nv_cvt_e8m0_to_bf16raw

d9d89b42

slaren approved these changes on 2025-08-05

ggerganov merged fd1234cb into master 159 days ago

ggerganov deleted the gpt-oss-mxfp4 branch 159 days ago

CISC commented on 2025-08-08

Reviewers

slaren

CISC

ngxson

0cc4m

JohannesGaessler

Labels

testing Nvidia GPU Vulkan examples python server ggml SYCL Apple Metal Ascend NPU OpenCL
