auto-round
7428cfa1 - Add out-of-tree FP8 static quantization patches for Hunyuan-A13B-Instruct-FP8

Add out-of-tree FP8 static quantization patches for Hunyuan-A13B-Instruct-FP8

Patch FineGrainedFP8HfQuantizer to support per-tensor static FP8 models (weight_scale + input_scale), which transformers 4.52 doesn't natively handle. All patches are monkey-patches in torch_gen.py; no source code changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
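The out-of-tree approach described above can be sketched as a plain method rebind. This is an illustration only, not the repo's actual patch: `DemoQuantizer` and `expected_scale_keys` are hypothetical stand-ins for transformers' FineGrainedFP8HfQuantizer and its checkpoint-key handling, used here to show the monkey-patch pattern.

```python
# Illustrative sketch of the monkey-patch pattern; DemoQuantizer stands in
# for transformers' FineGrainedFP8HfQuantizer, and expected_scale_keys is a
# hypothetical method name. The real patches live in torch_gen.py.

class DemoQuantizer:
    """Stand-in quantizer that only understands block-wise FP8 scales."""

    def expected_scale_keys(self, prefix):
        # Upstream behavior: block-wise layout only.
        return {f"{prefix}.weight_scale_inv"}


def patched_expected_scale_keys(self, prefix):
    # Accept per-tensor static checkpoints: one weight_scale and one
    # input_scale per quantized weight, as in Hunyuan-A13B-Instruct-FP8.
    return {f"{prefix}.weight_scale", f"{prefix}.input_scale"}


# Monkey-patch: rebind the method on the class at import time,
# leaving the library's source untouched.
DemoQuantizer.expected_scale_keys = patched_expected_scale_keys
```

Because the method is rebound on the class itself, every instance created afterwards (including ones the library constructs internally) picks up the per-tensor behavior, which is what makes the patch usable without forking transformers.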