transformers
Speedup model init on CPU (by 10x+ for llama-3-8B as one example)
#31771
Merged

muellerzr merged 42 commits into main from muellerzr-speedup-inference
muellerzr
HuggingFaceDocBuilderDev
muellerzr
muellerzr
muellerzr changed the title from "Speedup model loading by 1,100%+!" to "Speedup model loading (by 1,100%+) and .generate() on CPU (by 925%+)!" 1 year ago
muellerzr
amyeroberts
amyeroberts approved these changes on 2024-07-03
muellerzr
muellerzr commented on 2024-07-03
muellerzr
muellerzr
gante
muellerzr changed the title from "Speedup model loading (by 1,100%+) and .generate() on CPU (by 925%+)!" to "Speedup model loading (by ~10x) and .generate() on CPU (by ~10x)!" 1 year ago
muellerzr
muellerzr
SunMarc
muellerzr
tjruwase
muellerzr
tjruwase
muellerzr
tjruwase
tjruwase
muellerzr
muellerzr
muellerzr changed the title from "Speedup model loading (by ~10x) and .generate() on CPU (by ~10x)!" to "Speedup model init on CPU (by 30x+)" 1 year ago
muellerzr
muellerzr changed the title from "Speedup model init on CPU (by 30x+)" to "Speedup model init on CPU (by 30x+ for llama-3-70B)" 1 year ago
muellerzr changed the title from "Speedup model init on CPU (by 30x+ for llama-3-70B)" to "Speedup model init on CPU (by 30x+ for llama-3-70B as one example)" 1 year ago
muellerzr
ArthurZucker
ArthurZucker commented on 2024-07-09
ArthurZucker added the run-slow label
ArthurZucker
muellerzr force pushed to 27250296 1 year ago
muellerzr
ArthurZucker
ArthurZucker
muellerzr
SunMarc
muellerzr
muellerzr commented on 2024-07-10
muellerzr
muellerzr
muellerzr force pushed to 53987886 1 year ago
muellerzr force pushed to 5141f2b0 1 year ago
muellerzr
muellerzr force pushed to 0afc6c27 1 year ago
muellerzr changed the title from "Speedup model init on CPU (by 30x+ for llama-3-70B as one example)" to "Speedup model init on CPU (by 2x+ for llama-3-8B as one example)" 1 year ago
muellerzr changed the title from "Speedup model init on CPU (by 2x+ for llama-3-8B as one example)" to "Speedup model init on CPU (by 10x+ for llama-3-8B as one example)" 1 year ago
muellerzr removed the run-slow label
muellerzr added the run-slow label
muellerzr
muellerzr requested a review from ArthurZucker 1 year ago
muellerzr requested a review from amyeroberts 1 year ago
ArthurZucker
ArthurZucker commented on 2024-07-15
muellerzr 1,100%!
c3e49a8c
muellerzr Clean
e3bcff23
muellerzr Don't touch DS
248910a3
muellerzr Experiment with dtype allocation
08df7460
SunMarc skip test_load_save_without_tied_weights test
f1408369
muellerzr A little faster
b3373483
muellerzr Include proper upscaling?
9f45f625
muellerzr Fixup tests
dce912e7
muellerzr Potentially skip?
f62d4591
muellerzr Let's see if this fixes git history
7ebb3e9c
muellerzr Maintain new dtype
bef3a80f
muellerzr Fin
ca1010ec
muellerzr Rm hook idea for now
989612fb
muellerzr New approach, see what breaks
9fc7e8b4
muellerzr stage
79578eaf
muellerzr Clean
639df3b4
muellerzr Stash
cab132bd
muellerzr Should be fin now, just need to mark failing models
8338e2a3
muellerzr Clean up
67c52a01
muellerzr Simplify
20072493
muellerzr Deal with weird models
6f2e6505
muellerzr Enc/Dec
6cdae656
muellerzr Skip w/ reason
35696f67
muellerzr Adjust test
0ece40be
muellerzr Fix test
6946f86a
muellerzr one more test
f3f751c1
muellerzr Keep experimenting
a7c2a83f
muellerzr Fix ref
178cb143
muellerzr TO REMOVE: testing feedback CI
48be6f8b
muellerzr Right push
02c38fe2
muellerzr Update tests/utils/test_modeling_utils.py
74fdf4be
muellerzr disable
38d0e894
muellerzr Add new func
43359560
muellerzr force pushed to 43359560 1 year ago
amyeroberts
amyeroberts commented on 2024-07-16
muellerzr Test nits from Amy
9c5dc50e
muellerzr Update src/transformers/modeling_utils.py
c491952d
muellerzr Merge branch 'muellerzr-speedup-inference' of https://github.com/hugg…
fd3890ac
muellerzr Adjust comment
e8f4a148
muellerzr Adjust comment on skip
512f34ad
muellerzr make private
ada401f4
muellerzr Fin
1e5466a8
muellerzr requested a review from amyeroberts 1 year ago
muellerzr requested a review from stevhliu 1 year ago
muellerzr Should be a not flag
70448cdf
muellerzr Clarify and rename test
21af73ad
muellerzr
amyeroberts
amyeroberts approved these changes on 2024-07-16
muellerzr merged e0dfd7bc into main 1 year ago
muellerzr deleted the muellerzr-speedup-inference branch 1 year ago