llama.cpp
ggml-cuda: Add generic NVFP4 MMQ kernel
#21074
Merged
JohannesGaessler merged 19 commits into ggml-org:master from michaelw9999:nvfp4-mmq-mma
michaelw9999 requested a review 13 days ago
github-actions added Nvidia GPU
github-actions added ggml
Introduced NVFP4 generic MMQ kernel
94e58bec
Added extra FP8 guard, hope to solve ci HIP failure
2761dcaf
michaelw9999 force pushed from 9fd81b07 to 2761dcaf 12 days ago
am17an commented on 2026-03-28
michaelw9999 requested a review from IMbackK 12 days ago
Rename tiles and use HIP_FP8_AVAILABLE
cbd9fba6
michaelw9999 force pushed from 29b4a8d6 to cbd9fba6 12 days ago
am17an commented on 2026-03-28
am17an commented on 2026-03-28
Removed remaining FP8 straggler and added const int
0d9292cf
Const
0018ce86
am17an commented on 2026-03-28
Removed DECL_MMQ_CASE artifact
592e18cc
am17an commented on 2026-03-28
Removed newline
1489ea54
Removed space after else
3177030d
IMbackK commented on 2026-03-28
Changed HIP FP8 NVFP4 conversion gate
ebe28e97
Added new line to bottom of mmq.cu 270
aa55cb35
Removed extra spaces
8af43252
Removed single space in front of else on line 814
d8c5b7b6
Added NVFP4 to generate cu script so HIP can see it, further tightene…
cba8605e
github-actions added python
Include generated mmq-instance-nvfp4.cu
a2f724da
am17an approved these changes on 2026-03-29
Added NVFP4 mmq to HIP Check ignore list
4be4b92b
github-actions added script
JohannesGaessler commented on 2026-03-30
Update ggml/src/ggml-cuda/mmq.cuh
30d7c8c1
Update ggml/src/ggml-cuda/mmq.cuh
145d8f18
Update ggml/src/ggml-cuda/mmq.cuh
bf496f6a
Added function names to closing endif
e2babc37
JohannesGaessler approved these changes on 2026-04-01
JohannesGaessler merged 84f82e84 into master 8 days ago
michaelw9999 deleted the nvfp4-mmq-mma branch 8 days ago
Reviewers
JohannesGaessler
am17an
CISC
IMbackK
Assignees
No one assigned
Labels
script
Nvidia GPU
python
ggml
Milestone
No milestone