llama.cpp
ggml-cuda: Repost of 21896: Blackwell native NVFP4 support
#22196
Merged

michaelw9999
michaelw9999 Blackwell NVFP4 MMQ Kernel
a0818450
michaelw9999 Removed whitespace
9fb7e840
michaelw9999 Added FP8 Max definition and description
0bcf7b29
michaelw9999 Fixed 'f' typo
4625a7cc
michaelw9999 Removed whitespace from comment
3ea6b59d
michaelw9999 Guard Blackwell NVFP4 quantizer for Blackwell only
db5957e7
michaelw9999 Merged vec_dot_fp4_fp4_mma together
83b412f0
michaelw9999 Refactored to use 76-byte MMQ_MMA_TILE_X_K_FP4 and block_fp4_mmq inst…
c3188065
michaelw9999 Updated block_fp4_mmq packing comment
78596bfa
michaelw9999 Added assert for QK_K == 8 * QK_MXFP4 in mul_mat_q
a68327c7
michaelw9999 Removed extra space typo
6e31a22b
michaelw9999 Changed NVFP4 quant assert and using get_int_b4
58e277e4
michaelw9999 Removed bool has_ids template from quantize
0e2c7948
michaelw9999 Updated block_fp4_mmq packing comment
72fc0170
michaelw9999 Added ue4m3 bounds check for testscale
7fcc8c07
michaelw9999 Removed whitespace on line 52 of mmq.cuh
7c73198d
michaelw9999 Fixed MMQ_ITER_K_FP4 returning on non-FP4 models when running on Blac…
6b26a1c7
michaelw9999 Change GGML_ASSERT to static_assert
e34b6ff6
michaelw9999 Whitespace fixes
02df2638
michaelw9999 Change amax_raw mul 1/6 to: / 6
92045908
michaelw9999 Hoisted kbx0 and kbx out of the loop
667cc38d
michaelw9999 Update ggml/src/ggml-cuda/mmq.cuh
553c3a85
michaelw9999 Add endif blackwell mma comment
0d9e0458
michaelw9999 requested a review from ggerganov 58 days ago
michaelw9999 requested a review 58 days ago
github-actions added testing
github-actions added Nvidia GPU
github-actions added ggml
anskumar01
am17an
JohannesGaessler
am17an
michaelw9999
michaelw9999 marked this pull request as draft 52 days ago
ORippler
am17an
am17an approved these changes on 2026-04-28
am17an
JohannesGaessler
JohannesGaessler approved these changes on 2026-04-28
michaelw9999 marked this pull request as ready for review 51 days ago
am17an merged fc2b0053 into master 50 days ago
ORippler
michaelw9999
