llama.cpp
llama : add gpt-oss
#15091
Merged
ggerganov merged 48 commits into master from gpt-oss-mxfp4
oai moe
81991fcd
compat with new checkpoint
917f9233
add attn sink impl
a4ab8693
add rope scaling yarn
3801c364
logits match with latest transformers code
13f39f6b
wip chat template
b3594b30
Merge branch 'master' into xsn/oai_moe
bd571580
rm trailing space
089a7ab4
use ggml_scale_bias
4d01b36b
Merge branch 'master' into xsn/oai_moe
f271cc80
rm redundant is_swa_all
106b17e5
convert interleaved gate_up
e2c1beb3
Merge remote-tracking branch 'gg-public/master' into xsn/oai_moe-gg
4431c823
Merge remote-tracking branch 'gg-public/master' into xsn/oai_moe-gg
fe9b818b
Merge remote-tracking branch 'gg-public/master' into xsn/oai_moe-gg
539c2b63
graph : fix activation function to match reference (#7)
039a6f16
Merge branch 'master' into xsn/oai_moe-gg
aa240b99
Merge branch 'master' into xsn/oai_moe-gg
32a654c2
vocab : handle o200k_harmony special tokens
13f3568c
ggml : add attention sinks support (#1)
e59b2eb1
repack mxfp4 upon conversion
832dc26c
clean up a bit
c68069d1
enable thinking
423b1919
add quick hack to render only some special tokens
4dd479b7
fix bf16 conversion
ebc7da53
remove vocab hack
a543ddfd
webui ok
6b303729
support chat parsing for gpt-oss
44bdb752
Merge branch 'master' into xsn/oai_moe
65b536f9
fix webui
61979176
direct mapping mxfp4, FINALLY
3c4725ba
force using mxfp4
04cfb6d2
properly use lazy tensor
4cf69dff
ggml : add mxfp4
ec95c0e8
ggml : add ggml_add_id (#13)
3ef6c8c1
Merge branch 'master' into xsn/oai_moe
cd514cc3
Merge branch 'xsn/oai_moe' into mxfp4-rebased
98c4be53
ggerganov requested a review from 0cc4m 33 days ago
ggerganov requested a review from JohannesGaessler 33 days ago
ggerganov requested a review from ngxson 33 days ago
ggerganov requested a review from slaren 33 days ago
github-actions added labels: testing, Nvidia GPU, Vulkan, examples, python, server, ggml, SYCL, Apple Metal, Ascend NPU, OpenCL
Merge branch 'master' into gpt-oss-mxfp4
fcb23396
llama : fix compile error
98f34448
cuda : add fallback for __nv_cvt_e8m0_to_bf16raw
df8411ed
slaren force pushed from 5b6b1ffe to df8411ed 33 days ago
cleanup
60ab08a5
sycl : fix supports_op for MXFP4
256fe66c
fix Unknown reasoning format
cd8ed32b
ggml-cpu : fix AVX build
a3b291e8
fix hip build
1ea3769f
cuda : add mxfp4 dequantization support for cuBLAS
07d781e4
slaren force pushed from ee2adc79 to 07d781e4 33 days ago
ggml-cpu : fix mxfp4 fallback definitions for some architectures
b236c90f
cuda : fix version required for __nv_cvt_e8m0_to_bf16raw
d9d89b42
slaren approved these changes on 2025-08-05
ggerganov merged fd1234cb into master 33 days ago
ggerganov deleted the gpt-oss-mxfp4 branch 33 days ago
CISC commented on 2025-08-08
Reviewers
slaren
CISC
ngxson
0cc4m
JohannesGaessler
Assignees
No one assigned
Labels
testing
Nvidia GPU
Vulkan
examples
python
server
ggml
SYCL
Apple Metal
Ascend NPU
OpenCL
Milestone
No milestone