DeepSpeed
Fix fp8 gemm
#7265
Merged
loadams merged 12 commits into deepspeedai:master from RezaYazdaniAminabadi:fix-fp8-gemm
Optimize the fp-dequantizer to get high memory-BW utilization
10bad7de
fix formating
c5ba68e3
Merge branch 'master' into master
9975f753
Merge branch 'microsoft:master' into master
f950f722
Merge branch 'microsoft:master' into master
6381aaec
Merge branch 'deepspeedai:master' into master
8a0f1d5e
test
85e533fa
fix the fp8-gemm by removing prefetching from bf16 conversion (New Tr…
01a24d15
RezaYazdaniAminabadi requested a review from tohtana 308 days ago
RezaYazdaniAminabadi requested a review from tjruwase 308 days ago
formatting
e31f87f8
jeffra approved these changes on 2025-04-30
sfc-gh-mwyatt commented on 2025-04-30
Update deepspeed/ops/fp_quantizer/quantize.py
c4a90677
Update fp_quantizer.py
f6da0b58
sfc-gh-mwyatt requested a review from loadams 308 days ago
sfc-gh-mwyatt requested a review from jomayeri 308 days ago
Merge branch 'master' into fix-fp8-gemm
931c69c5
loadams approved these changes on 2025-05-08
loadams merged 069ec31c into master 300 days ago
Reviewers
loadams
jeffra
sfc-gh-mwyatt
tohtana
tjruwase
jomayeri
Assignees
No one assigned
Labels
None yet
Milestone
No milestone