auto-round
7428cfa1 - Add out-of-tree FP8 static quantization patches for Hunyuan-A13B-Instruct-FP8

Add out-of-tree FP8 static quantization patches for Hunyuan-A13B-Instruct-FP8

Patch FineGrainedFP8HfQuantizer to support per-tensor static FP8 models (weight_scale + input_scale), which transformers 4.52 doesn't natively handle. All patches are monkey-patches in torch_gen.py; no source code changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
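The out-of-tree approach described above can be sketched as a plain method rebind. This is an illustration only, not the repo's actual patch: `DemoQuantizer` and `expected_scale_keys` are hypothetical stand-ins for transformers' FineGrainedFP8HfQuantizer and its checkpoint-key handling, used here to show the monkey-patch pattern.

```python
# Illustrative sketch of the monkey-patch pattern; DemoQuantizer stands in
# for transformers' FineGrainedFP8HfQuantizer, and expected_scale_keys is a
# hypothetical method name. The real patches live in torch_gen.py.

class DemoQuantizer:
    """Stand-in quantizer that only understands block-wise FP8 scales."""

    def expected_scale_keys(self, prefix):
        # Upstream behavior: block-wise layout only.
        return {f"{prefix}.weight_scale_inv"}


def patched_expected_scale_keys(self, prefix):
    # Accept per-tensor static checkpoints: one weight_scale and one
    # input_scale per quantized weight, as in Hunyuan-A13B-Instruct-FP8.
    return {f"{prefix}.weight_scale", f"{prefix}.input_scale"}


# Monkey-patch: rebind the method on the class at import time,
# leaving the library's source untouched.
DemoQuantizer.expected_scale_keys = patched_expected_scale_keys
```

Because the method is rebound on the class itself, every instance created afterwards (including ones the library constructs internally) picks up the per-tensor behavior, which is what makes the patch usable without forking transformers.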