DeepSpeed
DS-Inference Quantization refresh: Fix several issues and add more features
#4351
Open

RezaYazdaniAminabadi wants to merge 23 commits into master from quantization-refresh
RezaYazdaniAminabadi
Add the llama2 support from the official llama repo
476ca30c
add back commented function
6cabd625
add new policy & implementation for llama2
7b2142c7
add some changes to inject/run the 70b llama model
f5d987d1
remove debugging code
c2c2d6b6
remove more debugging code
165042df
RezaYazdaniAminabadi Merge branch 'master' into add-llama2-support
81d692d0
mrwyattii formatting
24a3a0f2
RezaYazdaniAminabadi Merge branch 'master' into add-llama2-support
c0ca80e8
bring back quantization and add different bits support
f611c670
mrwyattii Merge branch 'master' into add-llama2-support
3e3945eb
Fix DS-Inference quantization and add more bits support
9f71c2a1
RezaYazdaniAminabadi requested a review from jeffra 2 years ago
RezaYazdaniAminabadi requested a review from mrwyattii 2 years ago
RezaYazdaniAminabadi requested a review from awan-10 2 years ago
RezaYazdaniAminabadi requested a review from cmikeh2 2 years ago
RezaYazdaniAminabadi requested a review from arashb 2 years ago
RezaYazdaniAminabadi Merge branch 'add-llama2-support' into quantization-refresh
66e97f81
RezaYazdaniAminabadi Merge branch 'master' into add-llama2-support
b37f7d86
use num_kv only when it has positive value
c33bc4fd
Merge branch 'add-llama2-support' of github.com:microsoft/DeepSpeed i…
ca61bd1e
use the num_kv param only if it is positive
db5a3b7f
awan-10 Merge branch 'master' into add-llama2-support
d0abfdda
awan-10 fix syntax and format errors.
297a15cd
fix an issue with the float32 transform kernel
a87860d2
Merge branch 'add-llama2-support' of github.com:microsoft/DeepSpeed i…
10a1df25
RezaYazdaniAminabadi Merge branch 'master' into add-llama2-support
c72aa76a
RezaYazdaniAminabadi Merge branch 'add-llama2-support' into quantization-refresh
e2ef102f
RezaYazdaniAminabadi changed the base branch from add-llama2-support to master 2 years ago
