refine inference backend/code step 1 #486
refine autoround format
1d514cfa
delete example to sync main
ae14c449
[pre-commit.ci] auto fixes from pre-commit.com hooks
90811247
Merge branch 'main' into refine_auto_qunatizer
7502d605
fix some issues
47130790
Merge branch 'main' into refine_auto_qunatizer
43d02f2f
clean code
12c24ba4
Merge branch 'refine_auto_qunatizer' of https://github.com/intel/auto…
ec1b4618
fix some issue
74482ea9
support device map
949bac3e
cache backend
72d6655c
fix preci
bb58f6af
Merge branch 'main' into refine_auto_qunatizer
0eb79879
support marlin
9796bd08
fix some issues
32d15c72
refine backend a little
7a4cd718
refine backend a little
32efa294
[pre-commit.ci] auto fixes from pre-commit.com hooks
7e5616a9
Merge branch 'main' into refine_auto_qunatizer
0187f915
rm cuda code
70ff42c9
fix preci issue
00bd9212
marlin and triton kernel are basically ready
9fd5920a
Merge branch 'main' into refine_auto_qunatizer
49b6e61a
add exllamav2 kernel ut
b0ccc968
tiny change
2876dce6
fix some issues
f90e9976
fix typo
ef952d2d
dtype is done
4c307b10
Merge branch 'main' into refine_auto_qunatizer
3d0945c4
provide a workaround for marlin offloading issue
ab6aef39
[pre-commit.ci] auto fixes from pre-commit.com hooks
58220434
fix some issues
bb8c0da3
fix some issues
ecd296e0
fix triton multiple gpu issue
c7eee0c2
wenhuach21
changed the title from "[WIP]refine auto quantizer" to "refine auto quantizer step 1" 343 days ago
Update auto_round/export/export_to_autogptq/export.py
797a95eb
Update auto_round/export/export_to_autoround/export.py
43197f8f
Update auto_round/export/export_to_awq/export.py
65953d03
remove gptq:marlin from support formats
371c821a
wenhuach21
changed the title from "refine auto quantizer step 1" to "refine inference backend/code step 1" 343 days ago
fix ut
b3fa4078
fix some issue
db69f91a
n1ck-guo
approved these changes
on 2025-04-09
fix ut and rm g_idx in packing
c6825517
recover g_idx packing for auto_gptq format
5c1e2f99
fix ut
afc8289f
Merge branch 'main' into refine_auto_qunatizer
984b8a40
wenhuach21
deleted the refine_auto_qunatizer branch 342 days ago