DeepSpeed
FP6 quantization end-to-end.
#5234
Merged
loadams merged 33 commits into master from features/rebase-quant-fp6
FP6 quantization end-to-end.
a4562ab7
Update CUDA kernels and clean codes.
91bb4d79
Make the quantizer on GPU.
1c2131d2
[WIP] Fix the bug of FP16-to-FP6 data packing.
1ba45fdd
Add FP6 end-to-end unit tests
ff6c3c3a
Refine the FP16-to-FP6 cast logic.
368a7630
Add unit tests for FP6 quantizer
6c45a84b
Fix FP16-FP6 cast problems.
90b710d2
Update FP6 kernels.
f8e3acfb
Fix the bug of subnormal FP6 casting and the 2bit/4bit tensor allocat…
b025c5ad
Clean code.
6ed67f77
pre-commit
20b543ca
Deal with the subnormal FP6 and FP16 values and refine the UT.
c43947a2
Update according to review comments.
a6d2f2f0
Fix the CI workflow problem for FP6 end-to-end.
62a2d495
Fix at::nullopt and at::optional conflicts.
118af370
Refine split-k setting.
56eb8b90
Remove debug files.
0ddbfd11
Only compiler the kernel body for SM >= 8.0.
35c82f25
Fix the GPU architecture requirement of FP6 kernel.
63489d17
Update deepspeed/inference/v2/config_v2.py
ed00ac92
Update deepspeed/inference/v2/config_v2.py
b15a1a10
refactor fp6 tests, fix import error
c2e6ebb9
Update deepspeed/inference/v2/modules/implementations/linear/quantize…
fb8887c9
Update requirements.txt
77f3883d
revert testing to fix A6000 test
f6bcdee0
Update pydantic version
e1a4ce04
fix pydantic import
e86611fc
Fix some review comments.
7e28144d
Pin pydantic to latest version
f8454a08
Add the missed torch import.
bed775e1
loadams requested a review from mrwyattii 1 year ago
loadams requested a review from awan-10 1 year ago
loadams requested a review from arashb 1 year ago
loadams requested a review from tjruwase 1 year ago
xiaoxiawu-microsoft enabled auto-merge 1 year ago
arashb requested a review from xiaoxiawu-microsoft 1 year ago
xiaoxiawu-microsoft approved these changes on 2024-03-06
mrwyattii approved these changes on 2024-03-06
arashb approved these changes on 2024-03-06
Auto-merge disabled 1 year ago (manually disabled by user)
Merge branch 'master' into features/rebase-quant-fp6
f34312ad
Merge branch 'master' into features/rebase-quant-fp6
4a917880
xiaoxiawu-microsoft enabled auto-merge 1 year ago
Auto-merge disabled 1 year ago (manually disabled by user)
loadams merged ccfdb84e into master 1 year ago
Reviewers: arashb, mrwyattii, xiaoxiawu-microsoft, awan-10, tjruwase
Assignees: No one assigned
Labels: None yet
Milestone: No milestone