Faster generation using AWQ + Fused modules #27411
v1 fusing modules
6c995f9f
add fused mlp support
85cc9c74
Merge remote-tracking branch 'upstream/main' into awq-fused-modules
b6cd5549
up
7ffbaa3e
fix CI
05b5f62e
block save_pretrained
8670aa20
fixup
9ee6b381
Merge remote-tracking branch 'upstream/main' into awq-fused-modules
f8d41775
small fix
1a8c9156
add new condition
b541b4dd
Merge remote-tracking branch 'upstream/main' into awq-fused-modules
2ea1f470
Merge branch 'awq-fused-modules' of https://github.com/younesbelkada/…
024b737d
add v1 docs
a7d74f80
add some comments
85e1e3b7
Merge branch 'main' into awq-fused-modules
3e6ba9bf
Merge remote-tracking branch 'upstream/main' into awq-fused-modules
26194d01
style
f160a162
fix nit
14c820d3
adapt from suggestion
03d8dff6
add check
0a08551c
change arg names
234165f6
change variables name
03980d92
Update src/transformers/integrations/awq.py
8a68a232
style
21f68794
split up into 3 different private methods
cde53efe
more conditions
8517e325
more checks
b187c070
add fused tests for custom models
c3e32ab9
fix
d3c77538
fix tests
4113c45e
final update docs
0bd1b0ca
younesbelkada
marked this pull request as ready for review 2 years ago
final fixes
61db4309
fix importlib metadata
cd37d323
Merge remote-tracking branch 'upstream/main' into awq-fused-modules
8f381edc
Merge branch 'awq-fused-modules' of https://github.com/younesbelkada/…
e80ad756
Update src/transformers/utils/quantization_config.py
b5c337cc
change it to `do_fuse`
3f98913d
nit
3bd0446a
Update src/transformers/utils/quantization_config.py
e1b3bfa4
Update src/transformers/utils/quantization_config.py
cb315465
Update src/transformers/utils/quantization_config.py
45875fd7
Merge branch 'awq-fused-modules' of https://github.com/younesbelkada/…
faaa255d
few fixes
c1ea9b2e
revert
d90eec75
fix test
e65687b7
fix copies
da78cf45
SunMarc
approved these changes
on 2023-12-04
Merge remote-tracking branch 'upstream/main' into awq-fused-modules
2fcc465c
raise error if model is not quantized
06976877
add test
12aff7c3
use quantization_config.config when fusing
498fe55f
Update src/transformers/modeling_utils.py
196095ed