DeepSpeed
DS-Inference Quantization refresh: Fix several issues and add more features
#4351

Open

RezaYazdaniAminabadi wants to merge 23 commits into master from quantization-refresh

Add the llama2 support from the official llama repo

476ca30c

add back commented function

6cabd625

add new policy & implementation for llama2

7b2142c7

add some changes to inject/run the 70b llama model

f5d987d1

remove debugging code

c2c2d6b6

remove more debugging code

165042df

Merge branch 'master' into add-llama2-support

81d692d0

formatting

24a3a0f2

Merge branch 'master' into add-llama2-support

c0ca80e8

bring back quantization and add different bits support

f611c670

Merge branch 'master' into add-llama2-support

3e3945eb

Fix DS-Inference quantization and add more bits support

9f71c2a1

RezaYazdaniAminabadi requested a review from jeffra 2 years ago

RezaYazdaniAminabadi requested a review from mrwyattii 2 years ago

RezaYazdaniAminabadi requested a review from awan-10 2 years ago

RezaYazdaniAminabadi requested a review from cmikeh2 2 years ago

RezaYazdaniAminabadi requested a review from arashb 2 years ago

Merge branch 'add-llama2-support' into quantization-refresh

66e97f81

Merge branch 'master' into add-llama2-support

b37f7d86

use num_kv only when it has positive value

c33bc4fd

Merge branch 'add-llama2-support' of github.com:microsoft/DeepSpeed i…

ca61bd1e

use the num_kv param only if it is positive

db5a3b7f

Merge branch 'master' into add-llama2-support

d0abfdda

fix syntax and format errors.

297a15cd

fix an issue with the float32 transform kernel

a87860d2

Merge branch 'add-llama2-support' of github.com:microsoft/DeepSpeed i…

10a1df25

Merge branch 'master' into add-llama2-support

c72aa76a

Merge branch 'add-llama2-support' into quantization-refresh

e2ef102f

RezaYazdaniAminabadi changed the base branch from add-llama2-support to master 2 years ago

Reviewers: jeffra, mrwyattii, awan-10, cmikeh2, arashb

Assignees: No one assigned

Labels: None yet

Milestone: No milestone
