vllm
[FP8][Kernel] Dynamic kv cache scaling factors computation
#11906
Merged


gshtras
micah-wil Dynamic Scale Factor Calculations for Key/Value Scales With FP8 KV Ca…
b302bfb1
gshtras Mllama kv scale fix (#335)
ef181a9d
gshtras format
f9645bff
gshtras Fix for different attention types
6ad050e0
gshtras Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
64668c62
gshtras Properly initializing the new field in the attn metadata (#337)
3eaca59b
gshtras Upstream doesn't have fp8 navi support yet
3a18c313
gshtras Cannot reference tensor contents during graph capture
390bdaa2
gshtras Adjusted cpu implementation datatypes
681ceb34
gshtras Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
9721ece8
gshtras requested a review from tlrmchlsmth 323 days ago
gshtras requested a review from WoosukKwon 323 days ago
gshtras requested a review from DarkLight1337 323 days ago
gshtras requested a review from ywang96 323 days ago
gshtras requested a review from robertgshaw2-redhat 323 days ago
gshtras requested a review from njhill 323 days ago
gshtras requested a review from comaniac 323 days ago
gshtras requested a review from alexm-redhat 323 days ago
gshtras requested a review from zhuohan123 323 days ago
gshtras requested a review from youkaichao 323 days ago
github-actions
mergify added documentation
mgoin requested a review from mgoin 319 days ago
mergify added needs-rebase
gshtras Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
5f95f8a1
mergify removed needs-rebase
mgoin commented on 2025-01-14
mergify added needs-rebase
gshtras
gshtras Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
33c79c9b
gshtras Adjust API to the latest attention revert
31e0041e
mergify removed needs-rebase
gshtras Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
d5f228e8
mgoin added quantization
mgoin added ready
hongxiayang added rocm
mgoin
gshtras Fix kv scales value in the tests
5bc162c1
mgoin enabled auto-merge (squash) 311 days ago
gshtras Add scales to test prefix prefill
84573bf0
disabled auto-merge 311 days ago
Head branch was pushed to by a user without write access
gshtras Preserving original float scale values to pass to the attention backen…
1ac64b08
gshtras Return assertions now that we have float values to use
d9edda42
gshtras Using tensors in test_reshape_and_cache_flash
6efd98ff
gshtras Using correct dtype for scales in the test
79811daf
mergify added needs-rebase
gshtras Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
ee8db019
mergify removed needs-rebase
mergify added needs-rebase
gshtras Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
38190485
mergify removed needs-rebase
gshtras Merge remote-tracking branch 'origin/main' into kv_cache_dynamic_comp…
2c55757c
gshtras Move kv scales documentation into the new file
55ef92cb
gshtras requested a review from mgoin 309 days ago
mgoin approved these changes on 2025-01-23
mgoin enabled auto-merge (squash) 309 days ago
mgoin merged e97f802b into main 309 days ago
gshtras deleted the kv_cache_dynamic_computation_upstream branch 309 days ago
