vllm
[v1][core] Support for attention free models
#20811
Merged

christian-pinto requested a review from WoosukKwon 159 days ago
christian-pinto requested a review from robertgshaw2-redhat 159 days ago
christian-pinto requested a review from njhill 159 days ago
christian-pinto requested a review from ywang96 159 days ago
christian-pinto requested a review from comaniac 159 days ago
christian-pinto requested a review from alexm-redhat 159 days ago
gemini-code-assist commented on 2025-07-11
mergify added the v1 label
gemini-code-assist commented on 2025-07-11
christian-pinto commented on 2025-07-11
christian-pinto force pushed 159 days ago
github-actions
heheda12345 commented on 2025-07-11
heheda12345
heheda12345
christian-pinto force pushed 156 days ago
christian-pinto Support for attention free models
b764c9dd
christian-pinto is_kv_cache_type_attention_free: return False if not attention free
5825ba45
christian-pinto some minor edits after first review round
fc86350b
christian-pinto Rebase to current master
97c11e62
christian-pinto force pushed to 97c11e62 156 days ago
christian-pinto Make pre-commits pass
673aeb06
heheda12345
heheda12345 commented on 2025-07-14
maxdebayser
christian-pinto requested a review from simon-mo 155 days ago
christian-pinto requested a review from youkaichao 155 days ago
christian-pinto requested a review from mgoin 155 days ago
christian-pinto requested a review from tlrmchlsmth 155 days ago
christian-pinto requested a review from houseroad 155 days ago
christian-pinto requested a review from hmellor 155 days ago
christian-pinto Disable chunk prefill and prefix caching when model is attention free
fb3ecfbc
christian-pinto force pushed to fb3ecfbc 155 days ago
christian-pinto reworked to allow for models like mamba to use the kv_cache for state…
8e5dbee2
christian-pinto cleanup config.py
2ee7087c
christian-pinto cleanup gpu_worker.py
19a7d708
heheda12345 commented on 2025-07-15
christian-pinto Edits after review
b8f355e8
heheda12345 approved these changes on 2025-07-15
heheda12345 changed the title from "[v1][core]Support for attention free models" to "[v1][core] Support for attention free models" 155 days ago
heheda12345 enabled auto-merge (squash) 155 days ago
github-actions added the ready label
christian-pinto
heheda12345 merged 4ffd963f into main 154 days ago
christian-pinto deleted the attention_free_models_support branch 153 days ago
