llama.cpp
ggml-cuda: Add generic NVFP4 MMQ kernel
#21074
Open
michaelw9999 wants to merge 19 commits into ggml-org:master from michaelw9999:nvfp4-mmq-mma
michaelw9999 requested a review 4 days ago
github-actions added Nvidia GPU
github-actions added ggml
Introduced NVFP4 generic MMQ kernel
94e58bec
Added extra FP8 guard, hope to solve ci HIP failure
2761dcaf
michaelw9999 force pushed from 9fd81b07 to 2761dcaf 4 days ago
am17an commented on 2026-03-28
michaelw9999 requested a review from IMbackK 3 days ago
Rename tiles and use HIP_FP8_AVAILABLE
cbd9fba6
michaelw9999 force pushed from 29b4a8d6 to cbd9fba6 3 days ago
am17an commented on 2026-03-28
am17an commented on 2026-03-28
Removed remaining FP8 straggler and added const int
0d9292cf
Const
0018ce86
am17an commented on 2026-03-28
Removed DECL_MMQ_CASE artifact
592e18cc
am17an commented on 2026-03-28
Removed newline
1489ea54
Removed space after else
3177030d
IMbackK commented on 2026-03-28
Changed HIP FP8 NVFP4 conversion gate
ebe28e97
Added new line to bottom of mmq.cu 270
aa55cb35
Removed extra spaces
8af43252
Removed single space in front of else on line 814
d8c5b7b6
Added NVFP4 to generate cu script so HIP can see it, further tightene…
cba8605e
github-actions added python
Include generated mmq-instance-nvfp4.cu
a2f724da
am17an approved these changes on 2026-03-29
Added NVFP4 mmq to HIP Check ignore list
4be4b92b
github-actions added script
JohannesGaessler commented on 2026-03-30
Update ggml/src/ggml-cuda/mmq.cuh
30d7c8c1
Update ggml/src/ggml-cuda/mmq.cuh
145d8f18
Update ggml/src/ggml-cuda/mmq.cuh
bf496f6a
Added function names to closing endif
e2babc37
Reviewers
am17an
JohannesGaessler
CISC
IMbackK
Assignees
No one assigned
Labels
script
Nvidia GPU
python
ggml
Milestone
No milestone