intel/auto-round
Branches:
AutoAdamRound_bugfix
Chinesization
ZaneMark-patch-1
actvation_quant
add_task_args_for_lmeval
ar_agent
ark_zp
autoround_support_qbits_backend
bf16_scale
chore/claude-init
copilot/fix-corner-case-in-auto-round
copilot/fix-deprecated-fp-layers-handling
copilot/fix-docstrings-in-python-files
copilot/fix-issue-with-auto-rounding
copilot/fix-llm-type-70b-bits-setting
copilot/fix-typeerror-wrapped-fn
copilot/replace-getset-module-torch-api
copilot/sageattention
copilot/speedup-fp8-linear-convert
copilot/speedup-fp8-linear-convert-again
copilot/speedup-fp8-linear-convert-another-one
copilot/sub-pr-1237-again
copilot/sub-pr-1237
copilot/sub-pr-1324
copilot/sub-pr-1522-again
copilot/sub-pr-1532
copilot/update-user-settings-page
ddp
debug_time_cost
debug-nvfp4
deepseekv3
ds-qwen
ds-v5
ds-v32
enable_glm4_moe_lite_quantization
enable_llama4_int8_baseline
enable_llama4_quant
enable_mxfp_exporting
feat/activation-checkpointing
fix_bug0627
fix_bug_0722
fix_bug_1105
fix_disable_act_dynamic_usage_in_mxfp.py
fix_dq
fix/fp-layers-deprecation-mapping
fix_gemma3_issue
fix_gguf_fp8
fix_gptqmodel
fix_low_cpu
fix_save_quantized_func_nvfp_checker
fix_0107
fix_0109
fix_0113
fix-attn-mask-b60
fix-ds
fix-flashinfer
fix-gpt-oss
fix-hpu
fix-to-meta-assertion-error-1499
fixbug_0717
fp4_v2
fp4_v3
fp8-cache
fp8-cache-based-export
fp8-static-quant-patch
fp8_export_backup_stable
fp8_export_for_test
good-flux
hadamard_change
hengguo/fix_cuda_ut
hengguo/fix_gguf_ds
hengguo/new_ar_arch
hengguo/quantizers
hengguo/refactor_init
hengguo/refactor_quant_step1
hengguo/smoothquant
hengguo/support_for_gemma4
hengguo/w4afp8_sim
henguo/update_so
hpu_only_kg
hpu_only_pkg
hpu/only/v1
kaihui/low_cpu_mem_usage
kaihui/torch_dtype
lazy-model-replace
leq_opub
lib/pre-4.4.0
llama/new/9-610
llama/new/9
llm-main
llmc
llmc-backup
llmc-test
lm-head-quant
load-kv
load-w8a8-replace-mod
load-w8a8
lvl/cpu_ram_optimization
lvl/fix_no_init_weights
lvl/general_moe_replacement
lvl/support_bagel_mot
lvl/support_fp8_with_ark
lvl/support_ovis_image
lvl/support_turbo_quant
lyt/numpy_fix
lyt/omni
main
marlin_modify
mengni/bug_fix
mengni/expert
mengni/mengni/block_wise
mengni/mx_int4
mengni/vllm
mengni/vlm
mengniwang95-patch-1
more-ar-ext
mxfp8
origin/block_wise
patch/for/ao/581/stable
patch-for-ao-2
pre-release/internal-inc/w4a8
quant-attn-hpu
quant-attn-hpu-o-scale
quant-attn-hpu-pr
quant-llama
quarot-llama
qwen3-vl
qwen3_vl_moe
qwen-split
qwen-v5
refine-doc-table
replace-lm-head
revert_order
revert-318-fix/hpu/check
revert-1231-set_disable_opt_rtn_default_2_none
revert-1562-suyue/ut
save_memory
set_disable_opt_rtn_default_2_none
static_quant
support_gemma4
suyue/ci
test-git
try_new_optimizer
update_fp_compile
update_0522
update_0819
upstream-ao
use-ep
ut-time
v0.7.0rc
v0.7.1rc
v0.8.0rc
v0.8.0rc2
v0.9.1rc
v0.9.2-release
v0.9.2rc
v0.9.3rc
v0.9.4rc
v0.9.5rc
v0.9.6rc
v0.9.7rc
v0.10.0rc
v0.10.1rc
v0.10.2rc
v0.10.3rc
v0.12.0rc
v0.12.1rc
v0.12.2rc
w4a4_int_quaro
w4int8dynamic
wenhuach21-patch-1
wfp8-afp8-bk
xin3he-patch-1
xinhe/3-20c
xinhe/3-27c
xinhe/3-27d
xinhe/3-30a
xinhe/3-30
xinhe/3-31a
xinhe/4-7
Commits (branch copilot/sub-pr-1324):

deba2171  Initial plan (Copilot, 80 days ago)
a949c5cd  allow_deprecated_quantization and simplify UT to reduce time (xin3he, 80 days ago)
8514a7dc  refactor eval and add UT (root, 80 days ago)
587eadb8  open source delta loss (#1300) (wenhuach21, 82 days ago, Verified)
3352a830  fix ignore layers regression (#1302) (wenhuach21, 82 days ago, Verified)
46f88606  Remove itrex format (#1301) (Kaihui-intel, 82 days ago, Verified)
e383eea3  refine moe modellings to release orginal expert module's ram (#1265) (WeiweiZhang1, 84 days ago, Verified)
8e6f3370  Update unit test script (#1290) (XuehaoSun, 84 days ago, Verified)
47c8aa4c  Update version (#1282) (XuehaoSun, 84 days ago, Verified)
ba48dd77  auto-round-kernel installation method (#1221) (chensuyue, 88 days ago, Verified)
6e1ea9cb  [vllm-ext] allows setting flashinfer workspace (#1259) (yiliu30, 88 days ago, Verified)
94db9057  add permission for workflow (#1190) (chensuyue, 88 days ago, Verified)
56750e8a  Update HPU CI to gaudi v1.23.0 (#1271) (XuehaoSun, 89 days ago, Verified)
2c67712f  gguf fix bug of tensors not on same tensor and gguf:q2_k_mixed (#1263) (n1ck-guo, 89 days ago, Verified)
1b7535a0  Fix critic bug causing GGUF to run on CPU (#1260) (wenhuach21, 90 days ago, Verified)
6a87a651  update supported scheme readme (#1257) (wenhuach21, 90 days ago, Verified)
583278d9  Update nightly release workflow (#1256) (chensuyue, 90 days ago, Verified)
b4ab21d7  Add nightly release workflow (#1255) (chensuyue, 91 days ago, Verified)
c7d6aee6  GGUF format add support for MoE models with non-linear expert layers. (#1244) (n1ck-guo, 91 days ago, Verified)
7a46aee8  fix cuda ut fail (#1237) (n1ck-guo, 91 days ago, Verified)
90db4445  Fix low cpu speed issue (#1251) (wenhuach21, 93 days ago, Verified)
9588bf90  WNA16 does not apply optimized RTN for moe layers by default (#1245) (wenhuach21, 94 days ago, Verified)
0f6dc76e  Organize test directory structure with logical categorization (#1243) (Copilot, 94 days ago, Verified)
cc521cab  Update setting of disable opt rtn (#1249) (WeiweiZhang1, 94 days ago, Verified)
bef28c9a  set disable_opt_rtn to optional bool and change default value to None (#1231) (WeiweiZhang1, 94 days ago, Verified)
001d0e39  [HPU]Export lazy mode env explicit (#1239) (yiliu30, 95 days ago, Verified)
0ad224c9  [API Change]rename fp_layers to ignore_layers (#1233) (wenhuach21, 95 days ago, Verified)
afea815e  Add FP8 dtype info to quant config (#1228) (yiliu30, 95 days ago, Verified)
6f2f9b9e  gguf add support for Mistral-Magistral-Devstral and Granite (#1226) (n1ck-guo, 96 days ago, Verified)
c2449059  [STEP 2] refactor format and export (#1192) (n1ck-guo, 97 days ago, Verified)