vllm
[FP8][Kernel] Dynamic kv cache scaling factors computation
#11906
Merged
mgoin merged 24 commits into vllm-project:main from ROCm:kv_cache_dynamic_computation_upstream
Dynamic Scale Factor Calculations for Key/Value Scales With FP8 KV Ca…
b302bfb1
Mllama kv scale fix (#335)
ef181a9d
format
f9645bff
Fix for different attention types
6ad050e0
Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
64668c62
Properly initializing the new field in the attn metadata (#337)
3eaca59b
Upstream doesn't have fp8 navi support yet
3a18c313
Cannot reference tensor contents during graph capture
390bdaa2
Adjusted cpu implementation datatypes
681ceb34
Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
9721ece8
gshtras
requested a review
from
tlrmchlsmth
323 days ago
gshtras
requested a review
from
WoosukKwon
323 days ago
gshtras
requested a review
from
DarkLight1337
323 days ago
gshtras
requested a review
from
ywang96
323 days ago
gshtras
requested a review
from
robertgshaw2-redhat
323 days ago
gshtras
requested a review
from
njhill
323 days ago
gshtras
requested a review
from
comaniac
323 days ago
gshtras
requested a review
from
alexm-redhat
323 days ago
gshtras
requested a review
from
zhuohan123
323 days ago
gshtras
requested a review
from
youkaichao
323 days ago
mergify
added
documentation
mgoin
requested a review
from
mgoin
319 days ago
mergify
added
needs-rebase
Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
5f95f8a1
mergify
removed
needs-rebase
mgoin
commented on 2025-01-14
mergify
added
needs-rebase
Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
33c79c9b
Adjust API to the latest attention revert
31e0041e
mergify
removed
needs-rebase
Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
d5f228e8
mgoin
added
quantization
mgoin
added
ready
hongxiayang
added
rocm
Fix kv scales value in the tests
5bc162c1
mgoin
enabled auto-merge (squash)
311 days ago
Add scales to test prefix prefill
84573bf0
disabled auto-merge
311 days ago
Head branch was pushed to by a user without write access
Preserving original float scale values to pass to the attention backen…
1ac64b08
Return assertions now that we have float values to use
d9edda42
Using tensors in test_reshape_and_cache_flash
6efd98ff
Using correct dtype for scales in the test
79811daf
mergify
added
needs-rebase
Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
ee8db019
mergify
removed
needs-rebase
mergify
added
needs-rebase
Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
38190485
mergify
removed
needs-rebase
Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
2c55757c
Move kv scales documentation into the new file
55ef92cb
gshtras
requested a review
from
mgoin
309 days ago
mgoin
approved these changes on 2025-01-23
mgoin
enabled auto-merge (squash)
309 days ago
mgoin
merged
e97f802b
into main
309 days ago
gshtras
deleted the kv_cache_dynamic_computation_upstream branch
309 days ago
Reviewers
mgoin
tlrmchlsmth
WoosukKwon
DarkLight1337
ywang96
robertgshaw2-redhat
njhill
comaniac
alexm-redhat
zhuohan123
youkaichao
Assignees
No one assigned
Labels
documentation
rocm
quantization
ready
Milestone
No milestone