DeepSpeed
Add fp8-fused gemm kernel
#5764
Merged

sfc-gh-reyazda Add fp8-fused gemm kernel
4c3b8fd5
sfc-gh-reyazda add get_scale function
c0e97f1c
sfc-gh-reyazda requested a review from awan-10 1 year ago
sfc-gh-reyazda requested a review from arashb 1 year ago
sfc-gh-reyazda requested a review from tjruwase 1 year ago
sfc-gh-reyazda requested a review from loadams 1 year ago
HeyangQin approved these changes on 2024-07-11
sfc-gh-reyazda fix a few things to run the test
cb0e0a65
sfc-gh-reyazda Merge branch 'master' into add-fp8-gemm
a11f9c5e
jeffra fixes for optim linear
4169b137
sfc-gh-reyazda fix illegal memory corner cases with an extra condition for reading s…
2f82f2b0
sfc-gh-reyazda reduce memory pressure
e489c563
jeffra Merge branch 'master' into add-fp8-gemm
32257963
jeffra add version check to fp quant op builder
c7e06ddc
sfc-gh-reyazda small fix for fp16 quantization
82d2a47a
jeffra fix formatting issues
dc1ab2e4
jeffra only import matmul_fp8 if triton is available
0dbddc24
jeffra delay import in test until after skip check
ab4d4729
jeffra allow 2.3.0 and 2.3.1
67d7aa9c
jeffra Merge branch 'master' into add-fp8-gemm
8a62ca40
jeffra fix issue with workflows that don't have pkg_version
86209003
loadams enabled auto-merge 1 year ago
disabled auto-merge 1 year ago
Manually disabled by user
loadams merged 4f950672 into master 1 year ago

