DeepSpeed
OptimizedLinear updates
#5791
Merged


loadams merged 20 commits into deepspeedai:master from Snowflake-Labs:ds-llama
jeffra
sfc-gh-reyazda Add fp8-fused gemm kernel
4c3b8fd5
sfc-gh-reyazda add get_scale function
c0e97f1c
sfc-gh-reyazda fix a few things to run the test
cb0e0a65
sfc-gh-reyazda Merge branch 'master' into add-fp8-gemm
a11f9c5e
jeffra fixes for optim linear
4169b137
jeffra progress
58967436
jeffra lora fixes + initial ckpt signal
e600a385
jeffra base_weight -> weight
ef52cd1e
jeffra use flattened tensors for BWS
a170fdd9
sfc-gh-reyazda fix illegal memory corner cases with an extra condition for reading s…
390a984f
sfc-gh-reyazda reduce memory pressure
b43c242b
jeffra more changes
40add9ea
jeffra requested a review from tjruwase 1 year ago
jeffra requested a review from awan-10 1 year ago
jeffra requested a review from arashb 1 year ago
jeffra requested a review from loadams 1 year ago
sfc-gh-reyazda small fix for fp16 quantization
057ce52f
winglian commented on 2024-07-25
jeffra ds lora injection api support (#8)
966ebd4f
jeffra Merge branch 'master' into ds-llama
6ec4eada
jeffra various clean-up
fe6b082f
jeffra updates for tests
c163c211
jeffra Merge branch 'master' into ds-llama
527cc236
jeffra Merge branch 'master' into ds-llama
2bf3290f
jeffra Merge branch 'master' into ds-llama
cbfd54de
HeyangQin approved these changes on 2024-08-13
loadams enabled auto-merge 1 year ago
loadams merged 6e5d58d2 into master 1 year ago

