PR #1182 Enable qwen3 vl moe quant and load

refine update_fused_layer_global_scales to fix device mismatch for nvfp UT

WeiweiZhang1 committed 122 days ago

enable qwen3_vl_moe quantization & quantized model loading

WeiweiZhang1 committed 114 days ago

[pre-commit.ci] auto fixes from pre-commit.com hooks

pre-commit-ci[bot] committed 114 days ago

fixtypo

WeiweiZhang1 committed 114 days ago

Merge branch 'enable_qwen3_vl_moe_quant' of https://github.com/intel/auto-round into enable_qwen3_vl_moe_quant

WeiweiZhang1 committed 114 days ago

Merge branch 'main' into enable_qwen3_vl_moe_quant

yiliu30 committed 113 days ago

Update auto_round/modelling/qwen3_vl_moe.py

WeiweiZhang1 committed 113 days ago

set calib_all_experts to false

WeiweiZhang1 committed 113 days ago

fix typo

WeiweiZhang1 committed 113 days ago

auto-round Enable qwen3 vl moe quant and load #1182 Merged