intel/auto-round

Commits

fix bug when whole block is excluded from quantization (#156)

wenhuach21 committed 1 year ago

Verified edcec56e

auto round quantizer supports gptq kernel (#155)

wenhuach21 committed 1 year ago

Verified 9cae103d

fix qbits issue (#153)

wenhuach21 committed 1 year ago

Verified c313fa33

Qbits related log (#151)

zhewang1-intc committed 1 year ago

Verified 34274fb3

autoround_support_qbits_backend (#145)

zhewang1-intc committed 1 year ago

Verified dbdc4a39

fix incorrect setting for lm-head (#149)

wenhuach21 committed 1 year ago

Verified 9da2beed

fix triton issue (#148)

wenhuach21 committed 1 year ago

Verified 04ea8694

refine the code (#143)

wenhuach21 committed 1 year ago

Verified 59c64022

Fix exllamav2 backend issue (#144)

wenhuach21 committed 1 year ago

Verified e614d138

Fix asym kernel issue by following autogptq's pr (#137)

wenhuach21 committed 1 year ago

Verified 794cd903

fix typos (#140)

WeiweiZhang1 committed 1 year ago

Verified 4d2d2591

bump version into v0.2 (#139)

chensuyue committed 1 year ago

Verified aafb82ef

handling transformers version compatibility in lmhead export, bugfix (#130)

WeiweiZhang1 committed 1 year ago

Verified 4db22e1d

fix export issue with torch 2.0 (#129)

wenhuach21 committed 1 year ago

Verified 5bff86ee

Update falcon recipe (#128)

wenhuach21 committed 1 year ago

Verified 416ec7e9

fix falcon quant issue with disable_trust_remote_code (#126)

WeiweiZhang1 committed 1 year ago

Verified e2985fdf

Update phi2 recipe (#124)

wenhuach21 committed 1 year ago

Verified edca2980

remove fp32 conversion in exporting to autogptq (#123)

wenhuach21 committed 1 year ago

Verified 17024b16

update gemma recipe (#121)

wenhuach21 committed 1 year ago

Verified ecc1dd65

Remove unused hook (#122)

XuehaoSun committed 1 year ago

Verified c7751c49

support `transformers.Conv1D` packing (#118)

Kaihui-intel committed 1 year ago

Verified 02a6660e

Fix export format issue (#120)

wenhuach21 committed 1 year ago

Verified ed29cf50

wenhuach21 committed 1 year ago

Verified 7bc9d8fb

fix lm-head quant issue at disable_quanted_input (#117)

wenhuach21 committed 1 year ago

Verified b51cfa98

support real lm-head quantization and mixed precision inference (#114)

wenhuach21 committed 1 year ago

Verified 4d1caebb

fix lm-head gradient accumulation bug (#113)

wenhuach21 committed 1 year ago

Verified ecca5349

update shells (#112)

WeiweiZhang1 committed 1 year ago

Verified 3c214db0

Adjust the default evaluation data type by selecting from the model path configuration (#107)

WeiweiZhang1 committed 1 year ago

Verified c42eaa92

20% speedup by removing new zero tensor (#110)

wenhuach21 committed 1 year ago

Verified c7434b63

1.8X speedup with disable_low_gpu_mem_usage, and reduced memory usage by avoiding torch.cat (#106)

wenhuach21 committed 1 year ago

Verified 23b60c3a
