intel/auto-round
Pull Requests
fix missing trust_remote_code in unfused_moe (#1532, xin3he, merged 2026-03-11 12:19)
Add markdown-link-check (#1531, XuehaoSun, merged 2026-03-16 05:49) [0.12.0]
fix broken link (#1530, xin3he, merged 2026-03-11 07:18)
Optimize ShardWriter: replace O(N²) set rebuild with persistent _all_saved set (#1528, Copilot, merged 2026-03-11 02:21)
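The #1528 title describes a classic incremental-bookkeeping optimization: rather than re-deriving the set of already-saved tensor names from every shard on each query (O(N²) overall), maintain one persistent set that grows as shards are added. A minimal sketch of that idea, with illustrative names (`ShardWriter`, `add_shard`, `is_saved`) that are not the repository's actual API:

```python
class ShardWriter:
    """Toy shard writer illustrating the persistent-set optimization."""

    def __init__(self):
        self.shards = []          # list of dicts: tensor name -> tensor
        self._all_saved = set()   # persistent set of all saved tensor names

    def add_shard(self, shard):
        self.shards.append(shard)
        # O(len(shard)) incremental update, instead of rebuilding the
        # union of all shards (O(total tensors)) on every lookup.
        self._all_saved.update(shard)

    def is_saved(self, name):
        # Before the optimization this would be:
        #   name in set().union(*self.shards)   # rebuilt on every call
        return name in self._all_saved


w = ShardWriter()
w.add_shard({"model.embed_tokens.weight": 1})
w.add_shard({"model.layers.0.q_proj.weight": 2})
print(w.is_saved("model.embed_tokens.weight"))  # True
print(w.is_saved("lm_head.weight"))             # False
```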
support MTP params: copy, fp8 dequant, and WOQ RTN quantization (#1527, Copilot, merged 2026-03-11 01:44)
support MTP module WOQ quantization and step3p5 model quantization (#1526, xin3he, merged 2026-03-12 14:26)
fix dynamic int8 w8a8 export issue with tuning (#1525, thuang6, merged 2026-03-11 08:34)
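For context on #1525: "dynamic int8 w8a8" means weights and activations are both kept in int8, with scales computed at runtime from the observed range. A minimal sketch of the underlying symmetric round-to-nearest scheme, using a hypothetical helper rather than auto-round's actual export code:

```python
def quantize_sym_int8(values):
    """Symmetric int8 RTN: map [-amax, amax] onto [-127, 127]."""
    amax = max(abs(v) for v in values) or 1.0   # avoid divide-by-zero
    scale = amax / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

q, s = quantize_sym_int8([0.5, -1.0, 0.25])
print(q)  # [64, -127, 32]
```

In the dynamic variant, `quantize_sym_int8` runs per tensor (or per token for activations) at inference time, so no calibrated activation scales need to be exported.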
Add unit tests for diffusion model eval branch in eval_cli.py (#1524, Copilot, closed 2026-03-10 03:15)
Add input validation to evaluate_diffusion_model (#1523, Copilot, merged 2026-03-10 02:59)
enable --eval for diffusion model (#1522, xin3he, merged 2026-03-10 06:49)
Enable more xpu instances for xpu CI (#1521, chensuyue, merged 2026-03-10 01:55)
fix low_gpu default value, refine doc (#1520, WeiweiZhang1, merged 2026-03-09 09:36)
Support diffusion model saving (#1519, mengniwang95, merged 2026-03-10 08:26)
[BUG] update moe check logic and make .gate ignore general (#1518, xin3he, closed 2026-03-09 05:08)
[BUG] update moe check logic and make .gate ignore general (#1517, xin3he, merged 2026-03-09 08:38)
fix bug of quantizing Z-image (#1516, xin3he, merged 2026-03-10 01:50)
support hadamard transform for mxfp4 with rtn or autoround method (#1515, lkk12014402, merged 2026-03-20 12:45) [api/new] [0.12.0]
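The idea behind #1515: rotating a weight vector with an orthogonal Hadamard matrix before low-bit quantization spreads outlier magnitude across all elements, which helps formats like MXFP4, and because the transform is orthogonal it can be undone exactly after dequantization. A self-contained sketch using the Sylvester construction (illustrative only, not the repository's implementation, which would use fast transforms rather than an explicit matrix):

```python
def hadamard(n):
    """Sylvester Hadamard matrix; n must be a power of two."""
    assert n > 0 and n & (n - 1) == 0
    H = [[1]]
    while len(H) < n:
        # H_{2k} = [[H_k, H_k], [H_k, -H_k]]
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def rotate(vec):
    """Apply the orthonormal transform H / sqrt(n) to vec."""
    n = len(vec)
    H = hadamard(n)
    scale = n ** -0.5
    return [scale * sum(H[i][j] * vec[j] for j in range(n)) for i in range(n)]

w = [8.0, 0.1, -0.2, 0.1]   # one outlier dominates the range
rw = rotate(w)              # magnitude now spread across elements
back = rotate(rw)           # H is symmetric, so rotating twice recovers w
print([round(x, 6) for x in back])  # recovers w up to float rounding
```

In a quantization pipeline, `rotate(w)` would be quantized (e.g. with RTN to MXFP4), and the inverse rotation folded into the adjacent layer or applied at dequantization.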
Fix CUDA UT and HPU UT (#1514, XuehaoSun, merged 2026-03-10 01:47)
Support GLM-Image model quantization (#1512, lvliang-intel, merged 2026-03-21 09:08) [0.12.0]
Fix #1284: preserve FP8 format for layers specified in ignore_layers (#1511, LuciferDono, closed 2026-03-18 08:22)
support minimax_m2 ignore layer: block_sparse_moe.gate (#1508, xin3he, merged 2026-03-09 02:55) [0.10.3]
[bug fix] force ignore mlp.gate (#1506, xin3he, merged 2026-03-09 02:55) [0.10.3]
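Several of these PRs (#1506, #1508, #1517) deal with keeping MoE router gates out of quantization: the gate projections (names ending in `mlp.gate` or `block_sparse_moe.gate`) are small but drive expert routing, so quantizing them can hurt accuracy badly. A hypothetical sketch of name-suffix-based force-ignore logic; the pattern list and helper name are illustrative, not auto-round's actual code:

```python
# Suffixes that should never be quantized, regardless of user config.
FORCE_IGNORE_SUFFIXES = ("mlp.gate", "block_sparse_moe.gate")

def should_quantize(layer_name, user_ignored=()):
    """Return False for router gates and user-ignored layers."""
    if layer_name.endswith(FORCE_IGNORE_SUFFIXES):
        return False
    return layer_name not in user_ignored

print(should_quantize("model.layers.3.mlp.gate"))       # False (router gate)
print(should_quantize("model.layers.3.mlp.gate_proj"))  # True  (ordinary FFN proj)
```

Note the suffix match must not catch `gate_proj`, the ordinary FFN gate projection, which is a different module and is safe to quantize; that distinction is presumably the kind of check-logic fix #1517's title refers to.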
Update README.md: Add nightly installation instructions for auto-round (#1505, chensuyue, merged 2026-03-09 01:02)
Fix FP8 quantizer for Transformers v4 (#1504, yiliu30, merged 2026-03-09 02:36) [hpu] [0.10.3]
Fix "Inference tensors do not track version counter" error (#1502, mengniwang95, merged 2026-03-10 03:40) [0.12.0]
[bug fix] force ignore mlp.gate (#1501, xin3he, merged 2026-03-05 09:14)
release CUDA memory in WeightConverter and avoid meaningless print (#1498, xin3he, merged 2026-03-10 01:40)
Refactor build process to use 'uv build' instead of 'python setup.py' (#1495, XuehaoSun, merged 2026-03-10 03:41)
Update torch to 2.10.0 in CPU CI (#1492, XuehaoSun, merged 2026-03-04 04:20)
reduce ram&vram usage for vlm calib stage (#1488, WeiweiZhang1, merged 2026-03-09 07:35)