Speedup model init on CPU (by 10x+ for llama-3-8B as one example) #31771
muellerzr changed the title from "Speedup model loading by 1,100%+!" to "Speedup model loading (by 1,100%+) and .generate() on CPU (by 925%+)!" 1 year ago
muellerzr changed the title from "Speedup model loading (by 1,100%+) and .generate() on CPU (by 925%+)!" to "Speedup model loading (by ~10x) and .generate() on CPU (by ~10x)!" 1 year ago
muellerzr changed the title from "Speedup model loading (by ~10x) and .generate() on CPU (by ~10x)!" to "Speedup model init on CPU (by 30x+)" 1 year ago
muellerzr changed the title from "Speedup model init on CPU (by 30x+)" to "Speedup model init on CPU (by 30x+ for llama-3-70B)" 1 year ago
muellerzr changed the title from "Speedup model init on CPU (by 30x+ for llama-3-70B)" to "Speedup model init on CPU (by 30x+ for llama-3-70B as one example)" 1 year ago
muellerzr force pushed to 27250296 1 year ago
muellerzr force pushed to 53987886 1 year ago
muellerzr force pushed to 5141f2b0 1 year ago
muellerzr force pushed to 0afc6c27 1 year ago
muellerzr changed the title from "Speedup model init on CPU (by 30x+ for llama-3-70B as one example)" to "Speedup model init on CPU (by 2x+ for llama-3-8B as one example)" 1 year ago
muellerzr changed the title from "Speedup model init on CPU (by 2x+ for llama-3-8B as one example)" to "Speedup model init on CPU (by 10x+ for llama-3-8B as one example)" 1 year ago
1,100%! (c3e49a8c)
Clean (e3bcff23)
Don't touch DS (248910a3)
Experiment with dtype allocation (08df7460)
skip test_load_save_without_tied_weights test (f1408369)
A little faster (b3373483)
Include proper upscaling? (9f45f625)
Fixup tests (dce912e7)
Potentially skip? (f62d4591)
Let's see if this fixes git history (7ebb3e9c)
Maintain new dtype (bef3a80f)
Fin (ca1010ec)
Rm hook idea for now (989612fb)
New approach, see what breaks (9fc7e8b4)
stage (79578eaf)
Clean (639df3b4)
Stash (cab132bd)
Should be fin now, just need to mark failing models (8338e2a3)
Clean up (67c52a01)
Simplify (20072493)
Deal with weird models (6f2e6505)
Enc/Dec (6cdae656)
Skip w/ reason (35696f67)
Adjust test (0ece40be)
Fix test (6946f86a)
one more test (f3f751c1)
Keep experimenting (a7c2a83f)
Fix ref (178cb143)
TO REMOVE: testing feedback CI (48be6f8b)
Right push (02c38fe2)
Update tests/utils/test_modeling_utils.py (74fdf4be)
disable (38d0e894)
Add new func (43359560)
muellerzr force pushed to 43359560 1 year ago
Test nits from Amy (9c5dc50e)
Update src/transformers/modeling_utils.py (c491952d)
Merge branch 'muellerzr-speedup-inference' of https://github.com/hugg… (fd3890ac)
Adjust comment (e8f4a148)
Adjust comment on skip (512f34ad)
make private (ada401f4)
Fin (1e5466a8)
Should be a not flag (70448cdf)
Clarify and rename test (21af73ad)
muellerzr merged e0dfd7bc into main 1 year ago
muellerzr deleted the muellerzr-speedup-inference branch 1 year ago