DeepSpeed
Ds-inference Int8 support through ZeroQuant technology
#2217
Merged
jeffra merged 25 commits into master from ds-inference/ZeroQuant-Int8
Fix the layer-past for GPT based models
cf2fe011
add the Int8 support for ds-inference using ZeroQuant technology
c2cf304c
RezaYazdaniAminabadi requested a review from jeffra 3 years ago
RezaYazdaniAminabadi requested a review from samyam 3 years ago
RezaYazdaniAminabadi requested a review from tjruwase 3 years ago
RezaYazdaniAminabadi requested a review from ShadenSmith 3 years ago
RezaYazdaniAminabadi requested a review from conglongli 3 years ago
RezaYazdaniAminabadi requested a review from awan-10 3 years ago
RezaYazdaniAminabadi requested a review from cli99 3 years ago
RezaYazdaniAminabadi requested a review from eltonzheng 3 years ago
RezaYazdaniAminabadi requested a review from minjiaz 3 years ago
RezaYazdaniAminabadi requested a review from duli2012 3 years ago
RezaYazdaniAminabadi requested a review from mrwyattii 3 years ago
RezaYazdaniAminabadi requested a review from yaozhewei 3 years ago
RezaYazdaniAminabadi requested a review from arashb 3 years ago
RezaYazdaniAminabadi requested a review from xiaoxiawu-microsoft 3 years ago
RezaYazdaniAminabadi requested a review from samadejacobs 3 years ago
fixing some issue with loading checkpoint and bias-add
d98f1f9b
adding the logic to store/restore scale for INT8 checkpoint
ebc82bb0
add empty quantization scale for different models to run with fp16
43a70230
Empty-Commit
00aa1888
Merge branch 'master' into ds-inference/ZeroQuant-Int8
9bed6452
fix sevral issues after merging with master
84e0d03b
several fixes for generating the INT8 sharded checkpoint
f6cb028d
Merge branch 'master' into ds-inference/ZeroQuant-Int8
d47bea6c
move quantizer declaration before inference branch
cb72d9ce
Merge branch 'master' into ds-inference/ZeroQuant-Int8
32b93224
fixing some part to catch up with latest update on HF side
57779eff
Merge branch 'ds-inference/ZeroQuant-Int8' of github.com:microsoft/De…
f4e48e60
reducing the CPU memory usage when loading checkpoint (this solves th…
dbcb6ec5
some minor modification to the ckpt names
cd80eccb
remove masking and some configuration changes
82a37d6d
remove dead code
9d126561
Merge branch 'master' into ds-inference/ZeroQuant-Int8
4ae356e4
RezaYazdaniAminabadi enabled auto-merge (squash) 3 years ago
Merge branch 'master' into ds-inference/ZeroQuant-Int8
d7ff3647
fix some issue with int8 ckpt-loading
b17a3b59
Merge branch 'master' into ds-inference/ZeroQuant-Int8
a541e52b
Merge branch 'master' into ds-inference/ZeroQuant-Int8
2845bad3
Merge branch 'master' into ds-inference/ZeroQuant-Int8
c77f5e0e
change the mp_size to tp_size at inference config & add some doc-stri…
f3f4b1dd
jeffra approved these changes on 2022-08-30
disabled auto-merge 3 years ago (Manually disabled by user)
jeffra merged afdc7287 into master 3 years ago
jeffra deleted the ds-inference/ZeroQuant-Int8 branch 3 years ago
Reviewers
jeffra
samyam
tjruwase
ShadenSmith
conglongli
awan-10
cli99
eltonzheng
minjiaz
duli2012
mrwyattii
yaozhewei
arashb
xiaoxiawu-microsoft
samadejacobs
Assignees: No one assigned
Labels: None yet
Milestone: No milestone