vllm
[v1][core] Support for attention free models
#20811

Merged

heheda12345 merged 10 commits into vllm-project:main from christian-pinto:attention_free_models_support

christian-pinto requested a review from WoosukKwon 159 days ago

christian-pinto requested a review from robertgshaw2-redhat 159 days ago

christian-pinto requested a review from njhill 159 days ago

christian-pinto requested a review from ywang96 159 days ago

christian-pinto requested a review from comaniac 159 days ago

christian-pinto requested a review from alexm-redhat 159 days ago

gemini-code-assist commented on 2025-07-11

mergify added v1

gemini-code-assist commented on 2025-07-11

christian-pinto commented on 2025-07-11

christian-pinto force pushed 159 days ago

heheda12345 commented on 2025-07-11

christian-pinto force pushed 156 days ago

Support for attention free models

b764c9dd

is_kv_cache_type_attention_free: return False if not attention free

5825ba45

some minor edits after first review round

fc86350b

Rebase to current master

97c11e62

christian-pinto force pushed to 97c11e62 156 days ago

Make pre-commits pass

673aeb06

heheda12345 commented on 2025-07-14

christian-pinto requested a review from simon-mo 155 days ago

christian-pinto requested a review from youkaichao 155 days ago

christian-pinto requested a review from mgoin 155 days ago

christian-pinto requested a review from tlrmchlsmth 155 days ago

christian-pinto requested a review from houseroad 155 days ago

christian-pinto requested a review from hmellor 155 days ago

Disable chunk prefill and prefix caching when model is attention free

fb3ecfbc

christian-pinto force pushed to fb3ecfbc 155 days ago

reworked to allow for models like mamba to use the kv_cache for state…

8e5dbee2

cleanup config.py

2ee7087c

cleanup gpu_worker.py

19a7d708

heheda12345 commented on 2025-07-15

Edits after review

b8f355e8

heheda12345 approved these changes on 2025-07-15

heheda12345 changed the title ~~[v1][core]Support for attention free models~~ [v1][core] Support for attention free models 155 days ago

heheda12345 enabled auto-merge (squash) 155 days ago

github-actions added ready

heheda12345 merged 4ffd963f into main 154 days ago

christian-pinto deleted the attention_free_models_support branch 153 days ago

Reviewers

heheda12345

gemini-code-assist

WoosukKwon

robertgshaw2-redhat

njhill

ywang96

comaniac

alexm-redhat

simon-mo

youkaichao

mgoin

tlrmchlsmth

houseroad

hmellor

Assignees

No one assigned

Labels

ready v1

Milestone

No milestone
