DeepSpeed
Ds-inference Int8 support through ZeroQuant technology
#2217
Merged

jeffra merged 25 commits into master from ds-inference/ZeroQuant-Int8
RezaYazdaniAminabadi
Fix the layer-past for GPT based models
cf2fe011
add the Int8 support for ds-inference using ZeroQuant technology
c2cf304c
RezaYazdaniAminabadi requested a review from jeffra 3 years ago
RezaYazdaniAminabadi requested a review from samyam 3 years ago
RezaYazdaniAminabadi requested a review from tjruwase 3 years ago
RezaYazdaniAminabadi requested a review from ShadenSmith 3 years ago
RezaYazdaniAminabadi requested a review from conglongli 3 years ago
RezaYazdaniAminabadi requested a review from awan-10 3 years ago
RezaYazdaniAminabadi requested a review from cli99 3 years ago
RezaYazdaniAminabadi requested a review from eltonzheng 3 years ago
RezaYazdaniAminabadi requested a review from minjiaz 3 years ago
RezaYazdaniAminabadi requested a review from duli2012 3 years ago
RezaYazdaniAminabadi requested a review from mrwyattii 3 years ago
RezaYazdaniAminabadi requested a review from yaozhewei 3 years ago
RezaYazdaniAminabadi requested a review from arashb 3 years ago
RezaYazdaniAminabadi requested a review from xiaoxiawu-microsoft 3 years ago
RezaYazdaniAminabadi requested a review from samadejacobs 3 years ago
fixing some issue with loading checkpoint and bias-add
d98f1f9b
adding the logic to store/restore scale for INT8 checkpoint
ebc82bb0
add empty quantization scale for different models to run with fp16
43a70230
Empty-Commit
00aa1888
RezaYazdaniAminabadi Merge branch 'master' into ds-inference/ZeroQuant-Int8
9bed6452
fix several issues after merging with master
84e0d03b
several fixes for generating the INT8 sharded checkpoint
f6cb028d
RezaYazdaniAminabadi Merge branch 'master' into ds-inference/ZeroQuant-Int8
d47bea6c
move quantizer declaration before inference branch
cb72d9ce
RezaYazdaniAminabadi Merge branch 'master' into ds-inference/ZeroQuant-Int8
32b93224
fixing some part to catch up with latest update on HF side
57779eff
Merge branch 'ds-inference/ZeroQuant-Int8' of github.com:microsoft/De…
f4e48e60
reducing the CPU memory usage when loading checkpoint (this solves th…
dbcb6ec5
some minor modification to the ckpt names
cd80eccb
remove masking and some configuration changes
82a37d6d
remove dead code
9d126561
jeffra Merge branch 'master' into ds-inference/ZeroQuant-Int8
4ae356e4
RezaYazdaniAminabadi enabled auto-merge (squash) 3 years ago
RezaYazdaniAminabadi Merge branch 'master' into ds-inference/ZeroQuant-Int8
d7ff3647
fix some issue with int8 ckpt-loading
b17a3b59
RezaYazdaniAminabadi Merge branch 'master' into ds-inference/ZeroQuant-Int8
a541e52b
RezaYazdaniAminabadi Merge branch 'master' into ds-inference/ZeroQuant-Int8
2845bad3
RezaYazdaniAminabadi Merge branch 'master' into ds-inference/ZeroQuant-Int8
c77f5e0e
change the mp_size to tp_size at inference config & add some doc-stri…
f3f4b1dd
jeffra approved these changes on 2022-08-30
disabled auto-merge 3 years ago
Manually disabled by user
jeffra merged afdc7287 into master 3 years ago
jeffra deleted the ds-inference/ZeroQuant-Int8 branch 3 years ago