Pull Requests intel/auto-round

fix google/gemma-3-4b-it

#1547 by xin3he was merged 2026-03-16 09:47 0.12.0

Fix sglang ut

#1541 by mengniwang95 was merged 2026-03-13 04:38

fix alg_ext torch compile issue

#1540 by wenhuach21 was merged 2026-03-13 10:32

Fix CI test_transformers issue caused by transformers 5.3.0

#1535 by lvliang-intel was merged 2026-03-13 05:25

[WIP] Address feedback on missing trust_remote_code in unfused_moe

#1533 by Copilot was closed 2026-03-11 08:19

fix missing trust_remote_code in unfused_moe

#1532 by xin3he was merged 2026-03-11 12:19

Add markdown-link-check

#1531 by XuehaoSun was merged 2026-03-16 05:49 0.12.0

fix broken link

#1530 by xin3he was merged 2026-03-11 07:18

Optimize ShardWriter: replace O(N²) set rebuild with persistent _all_saved set

#1528 by Copilot was merged 2026-03-11 02:21
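
The title above names a common optimization: instead of rebuilding the set of already-saved names from every shard on each lookup, keep one persistent set and update it as shards are written. A minimal sketch of that idea follows. Only the attribute name `_all_saved` comes from the PR title; the class names, methods, and tensor names are hypothetical and are not auto-round's actual ShardWriter code.

```python
from typing import Iterable, List, Set


class RebuildingTracker:
    """Hypothetical baseline: rebuilds the saved-name set on every lookup."""

    def __init__(self) -> None:
        self.shards: List[List[str]] = []

    def add_shard(self, names: Iterable[str]) -> None:
        self.shards.append(list(names))

    def is_saved(self, name: str) -> bool:
        # Walks every name in every shard each time it is called, so N
        # lookups over N saved names cost O(N^2) in total.
        saved = {n for shard in self.shards for n in shard}
        return name in saved


class PersistentTracker:
    """Hypothetical optimized version: one set maintained incrementally."""

    def __init__(self) -> None:
        self._all_saved: Set[str] = set()

    def add_shard(self, names: Iterable[str]) -> None:
        # Each name is added once when its shard is written.
        self._all_saved.update(names)

    def is_saved(self, name: str) -> bool:
        # Average O(1) membership test, no rebuild.
        return name in self._all_saved


def demo() -> bool:
    """Check that both trackers give the same answers."""
    slow, fast = RebuildingTracker(), PersistentTracker()
    for shard in (["layer.0.weight", "layer.0.bias"], ["layer.1.weight"]):
        slow.add_shard(shard)
        fast.add_shard(shard)
    queries = ["layer.0.weight", "layer.1.weight", "layer.2.weight"]
    return all(slow.is_saved(q) == fast.is_saved(q) for q in queries)
```

Both trackers return the same answers. The difference is only in cost: the persistent set trades a small amount of extra memory for constant-time lookups.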

support MTP params: copy, fp8 dequant, and WOQ RTN quantization

#1527 by Copilot was merged 2026-03-11 01:44

support MTP module WOQ quantization and step3p5 model quantization

#1526 by xin3he was merged 2026-03-12 14:26

fix dynamic int8 w8a8 export issue with tuning

#1525 by thuang6 was merged 2026-03-11 08:34

Add unit tests for diffusion model eval branch in eval_cli.py

#1524 by Copilot was closed 2026-03-10 03:15

Add input validation to evaluate_diffusion_model

#1523 by Copilot was merged 2026-03-10 02:59

enable --eval for diffusion model

#1522 by xin3he was merged 2026-03-10 06:49

Enable more XPU instances for XPU CI

#1521 by chensuyue was merged 2026-03-10 01:55

fix low_gpu default value, refine doc

#1520 by WeiweiZhang1 was merged 2026-03-09 09:36

Support diffusion model saving

#1519 by mengniwang95 was merged 2026-03-10 08:26

[BUG] update moe check logic and make .gate ignore general

#1518 by xin3he was closed 2026-03-09 05:08

[BUG] update moe check logic and make .gate ignore general

#1517 by xin3he was merged 2026-03-09 08:38

fix bug of quantizing Z-image

#1516 by xin3he was merged 2026-03-10 01:50

Fix CUDA UT and HPU UT

#1514 by XuehaoSun was merged 2026-03-10 01:47

support minimax_m2 ignore layer: block_sparse_moe.gate

#1508 by xin3he was merged 2026-03-09 02:55 0.10.3

[bug fix] force ignore mlp.gate

#1506 by xin3he was merged 2026-03-09 02:55 0.10.3

Update README.md: Add nightly installation instructions for auto-round

#1505 by chensuyue was merged 2026-03-09 01:02

Fix FP8 quantizer for Transformers v4 hpu

#1504 by yiliu30 was merged 2026-03-09 02:36 0.10.3

Fix "Inference tensors do not track version counter" error

#1502 by mengniwang95 was merged 2026-03-10 03:40 0.12.0

[bug fix] force ignore mlp.gate

#1501 by xin3he was merged 2026-03-05 09:14

release CUDA memory in WeightConverter and avoid meaningless print

#1498 by xin3he was merged 2026-03-10 01:40

Refactor build process to use 'uv build' instead of 'python setup.py'

#1495 by XuehaoSun was merged 2026-03-10 03:41