vllm
[Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments
#26315
Merged

MatthewBonanni
MatthewBonanni MatthewBonanni requested a review from simon-mo simon-mo 91 days ago
MatthewBonanni MatthewBonanni requested a review from WoosukKwon WoosukKwon 91 days ago
MatthewBonanni MatthewBonanni requested a review from youkaichao youkaichao 91 days ago
MatthewBonanni MatthewBonanni requested a review from robertgshaw2-redhat robertgshaw2-redhat 91 days ago
MatthewBonanni MatthewBonanni requested a review from mgoin mgoin 91 days ago
MatthewBonanni MatthewBonanni requested a review from tlrmchlsmth tlrmchlsmth 91 days ago
MatthewBonanni MatthewBonanni requested a review from houseroad houseroad 91 days ago
MatthewBonanni MatthewBonanni requested a review from hmellor hmellor 91 days ago
MatthewBonanni MatthewBonanni requested a review from yewentao256 yewentao256 91 days ago
MatthewBonanni MatthewBonanni requested a review from ProExpertProg ProExpertProg 91 days ago
gemini-code-assist
gemini-code-assist commented on 2025-10-06
chatgpt-codex-connector
chatgpt-codex-connector commented on 2025-10-06
MatthewBonanni MatthewBonanni requested a review from gshtras gshtras 91 days ago
MatthewBonanni MatthewBonanni requested a review from LucasWilkinson LucasWilkinson 91 days ago
mergify mergify added v1
MatthewBonanni MatthewBonanni requested a review from jikunshang jikunshang 91 days ago
mergify mergify added rocm
mergify
mergify mergify added needs-rebase
hmellor
hmellor
hmellor requested changes on 2025-10-09
mergify mergify added nvidia
ProExpertProg
MatthewBonanni
MatthewBonanni Add AttentionConfig and --attention-backend CLI argument
4bc2ee95
MatthewBonanni Add documentation for environment variable compatibility
cc130f5f
MatthewBonanni Add deprecation warnings for attention environment variables
a6161f54
MatthewBonanni Limit deprecation warning to VLLM_ATTENTION_BACKEND only
13727f39
MatthewBonanni Replace attention env var references with AttentionConfig
cd720ddd
MatthewBonanni Replace remaining VLLM_ATTENTION_BACKEND env var usages
3cd41b73
MatthewBonanni Remove forced_attn_backend global, use AttentionConfig directly
92c15d16
MatthewBonanni Replace remaining attention env var references with AttentionConfig
bdeab7ab
MatthewBonanni Apply suggestion from @gemini-code-assist[bot]
8af62383
MatthewBonanni MatthewBonanni force pushed from 52faefd4 to 8af62383 46 days ago
MatthewBonanni MatthewBonanni requested a review from pavanimajety pavanimajety 46 days ago
mergify mergify removed needs-rebase
MatthewBonanni fix pre-commit
f605638f
MatthewBonanni use get_kwargs
ebadc2c3
MatthewBonanni handle envs differently
30bc9e39
MatthewBonanni backend handled elsewhere
96c914cc
MatthewBonanni
MatthewBonanni remove old comments
53db82fb
MatthewBonanni remove old comments
92c673b9
MatthewBonanni change comment back
af5148ef
MatthewBonanni update hash
22ba47cd
hmellor
hmellor commented on 2025-11-21
MatthewBonanni remove comment
b7bc45e9
MatthewBonanni Update vllm/config/attention.py
5b296dc4
MatthewBonanni comment
9f7522e3
MatthewBonanni move to __post_init__
592b6438
hmellor
hmellor commented on 2025-11-21
hmellor
hmellor commented on 2025-11-21
MatthewBonanni remove unnecessary cache
2d017ae0
MatthewBonanni Update vllm/engine/arg_utils.py
5c2aa2da
MatthewBonanni Update vllm/engine/arg_utils.py
5b6f35dc
MatthewBonanni use enum instead of str
eb25b97b
MatthewBonanni Update vllm/config/attention.py
395a9ba0
MatthewBonanni convert str to enum
c0e7fe39
MatthewBonanni Update vllm/attention/utils/fa_utils.py
93090802
MatthewBonanni handle everything in attention config
e08f21fe
MatthewBonanni simplify
be4395ec
MatthewBonanni Merge branch 'main' into add-attention-config
46aa50f0
MatthewBonanni
MatthewBonanni MatthewBonanni requested a review from hmellor hmellor 45 days ago
MatthewBonanni remove --attention-backend
178e903f
MatthewBonanni update comment
aa39faeb
MatthewBonanni use config
4ba23673
MatthewBonanni MatthewBonanni requested a review from tjtanaa tjtanaa 45 days ago
MatthewBonanni clean up env variable usage
2a42d293
MatthewBonanni MatthewBonanni requested a review from DarkLight1337 DarkLight1337 45 days ago
MatthewBonanni MatthewBonanni requested a review from ywang96 ywang96 45 days ago
MatthewBonanni clean up env variable usage
affffe17
MatthewBonanni MatthewBonanni changed the title from "[Attention][UX] Add AttentionConfig and change attention backend to CLI argument" to "[Attention][UX][1/N] Add AttentionConfig and change attention backend to CLI argument" 45 days ago
MatthewBonanni Merge branch 'main' into add-attention-config
2746c5c0
MatthewBonanni MatthewBonanni changed the title from "[Attention][UX][1/N] Add AttentionConfig and change attention backend to CLI argument" to "[Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments" 45 days ago
MatthewBonanni Add --attention-backend back
0495e61c
mergify
mergify mergify added needs-rebase
hmellor
hmellor commented on 2025-11-24
MatthewBonanni Update vllm/config/attention.py
e60bdc52
MatthewBonanni Update vllm/platforms/cuda.py
ac28e3be
MatthewBonanni WIP: 4d01b64284 [Bugfix] - Add Trace Headers to Beam Search Path (#29…
c0646e11
mergify mergify removed needs-rebase
hmellor
hmellor approved these changes on 2025-11-24
hmellor hmellor enabled auto-merge (squash) 42 days ago
github-actions github-actions added ready
mergify
mergify mergify added needs-rebase
MatthewBonanni Merge branch 'main' into add-attention-config
6ca97a9e
disabled auto-merge 42 days ago
Head branch was pushed to by a user without write access
mergify mergify removed needs-rebase
mgoin
mgoin commented on 2025-11-24
MatthewBonanni Rename flashinfer_disable_q_quantization to disable_flashinfer_q_quan…
10bd4315
MatthewBonanni Merge branch 'main' into add-attention-config
d9347bd2
MatthewBonanni lazy import
49051f06
MatthewBonanni fix missed reference
9f981113
MatthewBonanni typo
40e2a647
MatthewBonanni rebuild dataclass
d0d910b0
MatthewBonanni Revert "rebuild dataclass"
614a0a1f
MatthewBonanni undo TYPE_CHECKING import
ef70b675
ProExpertProg
ProExpertProg commented on 2025-11-25
mergify
mergify mergify added needs-rebase
MatthewBonanni Merge branch 'main' into add-attention-config
bd0a5786
mergify mergify removed needs-rebase
MatthewBonanni add -ac shorthand
40b60d9f
MatthewBonanni update test to use AttentionConfig directly
941be4f5
MatthewBonanni
MatthewBonanni Merge branch 'main' into add-attention-config
be696453
MatthewBonanni validate during __post_init__ (env variable handling)
c93d923c
MatthewBonanni fix import
d3016e9d
MatthewBonanni update test
bc44febe
mergify
mergify mergify added needs-rebase
tjtanaa
tjtanaa commented on 2025-11-29
MatthewBonanni Merge branch 'main' into add-attention-config
1fe94ff1
mergify mergify removed needs-rebase
MatthewBonanni Merge branch 'main' into add-attention-config
48f67953
MatthewBonanni Merge branch 'main' into add-attention-config
6c1d59b7
MatthewBonanni Merge branch 'main' into add-attention-config
154c466e
MatthewBonanni fix cache behavior
86282173
MatthewBonanni Merge branch 'main' into add-attention-config
8d78f1f0
MatthewBonanni Fix test_nixl_connector
a1bc043b
MatthewBonanni MatthewBonanni requested a review from ApostaC ApostaC 34 days ago
mergify mergify added kv-connector
MatthewBonanni
MatthewBonanni Merge branch 'main' into add-attention-config
ce60e37e
MatthewBonanni Fix Blackwell compile test
132bd749
MatthewBonanni MatthewBonanni requested a review from tdoublep tdoublep 33 days ago
MatthewBonanni MatthewBonanni requested a review from zhuohan123 zhuohan123 33 days ago
MatthewBonanni MatthewBonanni requested a review from alexm-redhat alexm-redhat 33 days ago
MatthewBonanni MatthewBonanni requested a review from njhill njhill 33 days ago
MatthewBonanni Merge branch 'main' into add-attention-config
ecf68f26
MatthewBonanni Merge branch 'main' into add-attention-config
d20c9fcb
MatthewBonanni update version to allow 1 release
d0abdba9
MatthewBonanni Fix test_attention_backends
39d65a69
MatthewBonanni fix test_gpu_model_runner
e8c7e780
hmellor
hmellor commented on 2025-12-04
MatthewBonanni Fix FlashInferMetadataBuilder.__init__ and revert "Fix test_attention…
a971d7c2
LucasWilkinson Merge branch 'main' into add-attention-config
99544358
LucasWilkinson LucasWilkinson added ready-run-all-tests
vllm-bot vllm-bot merged 66e674cd into main 31 days ago
MatthewBonanni MatthewBonanni deleted the add-attention-config branch 31 days ago
wangshangsam
nvpohanh
