DeepSpeed
Add fp8-fused gemm kernel
#5764
Merged
loadams merged 16 commits into deepspeedai:master from Snowflake-Labs:add-fp8-gemm
Add fp8-fused gemm kernel
4c3b8fd5
add get_scale function
c0e97f1c
sfc-gh-reyazda requested a review from awan-10 1 year ago
sfc-gh-reyazda requested a review from arashb 1 year ago
sfc-gh-reyazda requested a review from tjruwase 1 year ago
sfc-gh-reyazda requested a review from loadams 1 year ago
HeyangQin approved these changes on 2024-07-11
fix a few things to run the test
cb0e0a65
Merge branch 'master' into add-fp8-gemm
a11f9c5e
fixes for optim linear
4169b137
fix illegal memory corner cases with an extra condition for reading s…
2f82f2b0
reduce memory pressure
e489c563
Merge branch 'master' into add-fp8-gemm
32257963
add version check to fp quant op builder
c7e06ddc
small fix for fp16 quantization
82d2a47a
fix formatting issues
dc1ab2e4
only import matmul_fp8 if triton is available
0dbddc24
delay import in test until after skip check
ab4d4729
allow 2.3.0 and 2.3.1
67d7aa9c
Merge branch 'master' into add-fp8-gemm
8a62ca40
fix issue with workflows that don't have pkg_version
86209003
loadams enabled auto-merge 1 year ago
auto-merge disabled 1 year ago (manually disabled by user)
loadams merged 4f950672 into master 1 year ago
Reviewers
HeyangQin
awan-10
arashb
tjruwase
loadams
Assignees
No one assigned
Labels
None yet
Milestone
No milestone