llama.cpp
Force NVFP4 W4A8 path for NVFP4_W4A16 layers on Blackwell, where NVFP4 normally uses the native W4A4 path.
#24364

Open

ynankani wants to merge 2 commits into ggml-org:master from ynankani:ynankani/Force_W4A16_NVFP4_to_W4A8

ynankani requested a review from ggerganov 12 days ago

ynankani requested a review from CISC 12 days ago

ynankani requested a review 12 days ago

github-actions added the testing, Nvidia GPU, python, and ggml labels

sanmai commented on 2026-06-10

am17an commented on 2026-06-10

Force NVFP4 W4A8 path for NVFP4_W4A16 layers

dfee78d3

Add a Knob to allow W4A4 for user, even if checkpoint specifies W4A16…

18f1df39

ynankani force pushed from b72a8c94 to 18f1df39 4 days ago

github-actions added the documentation and CUDA labels

Reviewers

ORippler

am17an

michaelw9999

sanmai

ggerganov

CISC

Assignees

No one assigned

Labels

documentation testing Nvidia GPU python ggml CUDA

Milestone

No milestone