llama.cpp
ggml : add NVFP4 quantization type support for metal
#20060
Open


richarddd wants to merge 44 commits into ggml-org:master from richarddd:feat/nvfp4-metal
richarddd WIP: add NVFP4 quantization support
f5a137d7
richarddd tests
8b4e790e
richarddd improve NVFP4 dot product implementation performance and fix bad sup…
e3e13301
richarddd typo
cfe06795
richarddd Use nvfp4 kvalues
984aaee7
richarddd vulkan : fix NVFP4 shader compilation by including kvalues_mxfp4 look…
cd84cc3d
richarddd vulkan and perf fixes
befad80d
richarddd wip
cf1d533a
richarddd Fix metal
7c730baf
richarddd fix vulkan
622a6e8f
richarddd Rename threshold & fix wrong scale
3c6f4cae
richarddd Fix MOE
06e14c5c
richarddd Shelf backend implementations (CUDA, Metal, Vulkan, arch-specific SIMD)
04870346
richarddd Fix arch-fallback.h: add NVFP4 generic fallback for all platforms
a8f8fbaa
richarddd quantize: add NVFP4 as a quantization type option
fe52c511
richarddd Fix ggml_fp32_to_ue4m3: handle subnormal values
4f232bee
richarddd Restore ARM NEON NVFP4 dot product implementation
dc5a0228
richarddd Optimize ARM NEON NVFP4 dot product: LUT + vpaddq + vfmaq
b99855e6
richarddd ARM NEON NVFP4: rearrange q8 to match nibble layout
5951d107
richarddd CPU only backend 64 super-block layout
36491e40
richarddd cleanup
fa018357
richarddd Remove unused LUT
68a6e2d7
richarddd int
ee52fdd1
richarddd exclude NVFP4 from unsupported ops in metal build
81218b23
richarddd remove quantization for now
a27ee0d6
richarddd store scales as native UE4M3, preserve original model bits when possible
73bd0f4f
richarddd Update convert_hf_to_gguf.py
d6d3368b
richarddd correct comment
b0c75e22
richarddd format
a26f2fc7
richarddd reduce duplication and cleanup
6e434346
richarddd Address comments
52b9baa3
richarddd move detection to prepare_tensors
0519bfcc
richarddd Use math instead of const
2009a9c2
richarddd Move
733cac68
richarddd fix comment
ff1eec6d
richarddd Shelf quantize tests
3f97de2f
richarddd Rebase and move check
b5912f25
richarddd cleanup
fa669191
richarddd lint
9fa4ddc1
richarddd Update gguf-py/gguf/scripts/gguf_convert_endian.py
f75235a1
richarddd Use fallback quant config
27cf7483
richarddd Metal support for NVFP4
0964096c
richarddd requested a review from ggerganov 6 days ago
richarddd requested a review from CISC 6 days ago
github-actions added the testing label
github-actions added the python label
github-actions added the ggml label
github-actions added the Apple Metal label
richarddd These should not be shelved
2e96fb42
richarddd Format
da39e58e
michaelw9999
richarddd
ggerganov
ggerganov marked this pull request as draft 5 days ago
