llama.cpp
CUDA: experimental native mxfp4 support for blackwell
#17906
Merged


am17an merged 15 commits into ggml-org:master from am17an:mxfp4
am17an
am17an requested a review from JohannesGaessler 19 days ago
github-actions added Nvidia GPU
github-actions added ggml
am17an marked this pull request as draft 19 days ago
CISC
CISC commented on 2025-12-10
am17an force pushed from 16e8a11d to 9dde464c 19 days ago
easyfab
JohannesGaessler
JohannesGaessler commented on 2025-12-10
JohannesGaessler
JohannesGaessler commented on 2025-12-11
am17an force pushed from b978da31 to a1672f62 18 days ago
am17an
am17an force pushed from 870c9d72 to 61c41a0d 17 days ago
mediouni-m
mediouni-m commented on 2025-12-14
mediouni-m
mediouni-m commented on 2025-12-14
mediouni-m
michaelw9999
am17an
michaelw9999
am17an
am17an force pushed from 8acc50ba to 83f62fb9 12 days ago
JohannesGaessler
michaelw9999
am17an requested a review from JohannesGaessler 12 days ago
JohannesGaessler
JohannesGaessler
JohannesGaessler commented on 2025-12-17
am17an force pushed from f5e46e55 to 40b7df86 12 days ago
am17an force pushed from e4c2ac39 to 43c262b9 12 days ago
am17an force pushed from 43c262b9 to bdde3284 12 days ago
am17an force pushed from bdde3284 to 5dfa63da 11 days ago
am17an marked this pull request as ready for review 11 days ago
am17an requested a review from JohannesGaessler 10 days ago
ORippler
ORippler commented on 2025-12-19
am17an
ggerganov
4bc93a6c CUDA: experimental native mxfp4 support for blackwell
38648a4f optimize load_tiles
f8d20c59 optimize quantize_mxfp4
ae617ac0 cleanup
b0e3c620 first pass review: formatting
9ebe043c use interleaved layout for mma
d7edade3 mmq: add assert for size
97460714 use __nv_fp4x4_e2m1
d6cb832b use iter_k as 512, cleanup
3fca15c2 Use 1200 as blackwell instead of 1000
41329d35 address review comments
b3919910 mmq: fix stride
am17an
am17an
ggerganov
am17an
am17an force pushed from 5dfa63da to c5799d69 9 days ago
am17an
am17an force pushed from c5799d69 to 092f8612 9 days ago
am17an changed the title from "CUDA: experimental native mxfp4 support for blackwell" to "[WIP] CUDA: experimental native mxfp4 support for blackwell" 8 days ago
404054c1 quantize.cu: use reference impl of e8m0 scale
am17an force pushed from 092f8612 to 404054c1 8 days ago
JohannesGaessler
JohannesGaessler commented on 2025-12-22
ggerganov
5d3780d4 address review comments
am17an
JohannesGaessler
JohannesGaessler approved these changes on 2025-12-24
JohannesGaessler
JohannesGaessler
JohannesGaessler commented on 2025-12-24
dc04da57 add 120f-virtual + minor fixes
CISC
am17an
JohannesGaessler
CISC
am17an
CISC
michaelw9999
fuutott
am17an
am17an merged c8a2417d into master 5 days ago
am17an deleted the mxfp4 branch 5 days ago
michaelw9999
michaelw9999
fuutott
am17an
JohannesGaessler
michaelw9999
JohannesGaessler
am17an
michaelw9999
CISC
JohannesGaessler
ServeurpersoCom
phaelon74
ServeurpersoCom
eugr
JohannesGaessler
eugr
eugr
phaelon74
am17an
BahamutRU
ServeurpersoCom
am17an
Panchovix
am17an
Panchovix
timkhronos
JohannesGaessler
