llama.cpp
CUDA: experimental native mxfp4 support for blackwell
#17906
Merged
am17an merged 15 commits into ggml-org:master from am17an:mxfp4
am17an requested a review from JohannesGaessler 19 days ago
github-actions added Nvidia GPU
github-actions added ggml
am17an marked this pull request as draft 19 days ago
CISC commented on 2025-12-10
am17an force pushed from 16e8a11d to 9dde464c 19 days ago
JohannesGaessler commented on 2025-12-10
JohannesGaessler commented on 2025-12-11
am17an force pushed from b978da31 to a1672f62 18 days ago
am17an force pushed from 870c9d72 to 61c41a0d 17 days ago
mediouni-m commented on 2025-12-14
mediouni-m commented on 2025-12-14
am17an force pushed from 8acc50ba to 83f62fb9 12 days ago
am17an requested a review from JohannesGaessler 12 days ago
JohannesGaessler commented on 2025-12-17
am17an force pushed from f5e46e55 to 40b7df86 12 days ago
am17an force pushed from e4c2ac39 to 43c262b9 12 days ago
am17an force pushed from 43c262b9 to bdde3284 12 days ago
am17an force pushed from bdde3284 to 5dfa63da 11 days ago
am17an marked this pull request as ready for review 11 days ago
am17an requested a review from JohannesGaessler 10 days ago
ORippler commented on 2025-12-19
CUDA: experimental native mxfp4 support for blackwell (4bc93a6c)
optimize load_tiles (38648a4f)
optimize quantize_mxfp4 (f8d20c59)
cleanup (ae617ac0)
first pass review: formatting (b0e3c620)
use interleaved layout for mma (9ebe043c)
mmq: add assert for size (d7edade3)
use __nv_fp4x4_e2m1 (97460714)
use iter_k as 512, cleanup (d6cb832b)
Use 1200 as blackwell instead of 1000 (3fca15c2)
address review comments (41329d35)
mmq: fix stride (b3919910)
am17an force pushed from 5dfa63da to c5799d69 9 days ago
am17an force pushed from c5799d69 to 092f8612 9 days ago
am17an changed the title from "CUDA: experimental native mxfp4 support for blackwell [WIP]" to "CUDA: experimental native mxfp4 support for blackwell" 8 days ago
quantize.cu: use reference impl of e8m0 scale (404054c1)
am17an force pushed from 092f8612 to 404054c1 8 days ago
JohannesGaessler commented on 2025-12-22
address review comments (5d3780d4)
JohannesGaessler approved these changes on 2025-12-24
JohannesGaessler commented on 2025-12-24
add 120f-virtual + minor fixes (dc04da57)
am17an merged c8a2417d into master 5 days ago
am17an deleted the mxfp4 branch 5 days ago
Reviewers: JohannesGaessler, ggerganov, CISC, ORippler, woachk, mediouni-m
Assignees: No one assigned
Labels: Nvidia GPU, ggml
Milestone: No milestone