vllm
[ROCm] [Feature] [Doc] [Dockerfile] [BugFix] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing
#12501
Merged

tjtanaa
kliuae Add ptpc-fp8 quantization
8c615443
tjtanaa Enable torch._scaled_mm rowwise gemm fp8
d86df2d2
tjtanaa Update PyTorch version in Dockerfile.rocm_base; Update AMD GPU instal…
4d578810
tjtanaa requested a review from mgoin 326 days ago
tjtanaa requested a review from robertgshaw2-redhat 326 days ago
tjtanaa requested a review from tlrmchlsmth 326 days ago
github-actions
mergify added documentation
mergify added ci/build
tjtanaa add ptpc fp8 unittests
ef98cef6
mgoin
mgoin commented on 2025-01-28
hongxiayang added rocm
tjtanaa
mgoin
mgoin commented on 2025-01-31
tjtanaa fix test_fp8.py::test_kv_cache_model_load_and_run; remove unnecessary…
0f309c2f
tjtanaa changed the title from "[ROCm] [Feature] [Doc] [Dockerfile] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing" to "[ROCm] [Feature] [Doc] [Dockerfile] [BugFix] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing" 322 days ago
tjtanaa Merge remote-tracking branch 'origin/main' into ptpc-fp8-rocm-2
004dadbe
tjtanaa format lint code
73d7bd10
tjtanaa
mgoin added ready
mgoin
mgoin approved these changes on 2025-02-07
tjtanaa Merge remote-tracking branch 'origin/main' into ptpc-fp8-rocm-2
12f42de8
tjtanaa introduce USE_ROWWISE_TORCH_SCALED_MM
881ce38d
DarkLight1337 enabled auto-merge (squash) 316 days ago
disabled auto-merge 316 days ago
Manually disabled by user
DarkLight1337 enabled auto-merge (squash) 316 days ago
simon-mo merged eaa92d44 into main 316 days ago
tjtanaa deleted the ptpc-fp8-rocm-2 branch 39 days ago
