vllm
Remove unused kwargs from model definitions
#13555
Merged
simon-mo merged 40 commits into vllm-project:main from hmellor:remove-unused-attn-args
Remove `kv_cache` and `attn_metadata` from `Attention`
28c7f270
Remove `attn_metadata` from `MambaMixer` 1 & 2
1fe2b0db
Remove `kv_caches` and `attn_metadata` from `forward` call
153d253f
Remove `kv_caches` and `attn_metadata` from new model docs
eb30940a
Remove `kv_caches` and `attn_metadata` from model interface
7a757531
Remove args from a batch of models
7ddfd1fb
mergify added the documentation label
Remove args from another batch of models
f8794e9d
hmellor added the ready label
hmellor marked this pull request as ready for review 356 days ago
Remove `attn_metadata` from a couple more places
f81cad0e
Attempt fix HPU model runner
6beb1b14
Update CPU model runners
c7840700
Update V1 GPU model runner
72450ae1
hmellor requested a review from WoosukKwon 356 days ago
hmellor requested a review from robertgshaw2-redhat 356 days ago
hmellor requested a review from njhill 356 days ago
hmellor requested a review from ywang96 356 days ago
hmellor requested a review from comaniac 356 days ago
hmellor requested a review from alexm-redhat 356 days ago
mergify added the v1 label
DarkLight1337 requested a review from youkaichao 356 days ago
Update draft model runner
fdda9c6a
Update enc dec model runner
f9a1ee8a
Update remaining non-device model runners
b91538a9
mergify added the speculative-decoding label
Allow `kv_caches` to be passed to `execute_model`
59f01be8
Update XPU model runner
778910f5
Update V1 GPU model runner
c7cd8522
Update OpenVINO model runner
334d2b37
Update Neuron model runner
0735ed90
Add unused `kv_caches` arg to runners to limit scope of PR
5a8a73d0
Update TPU V0 and V1
3b9a35b6
Update HPU model runner
bb094d23
Make `kv_caches` optional in `HPUModelRunner.execute_model`
46d8fabd
Make linter happy
39ad6d44
Fix whisper test
164ee323
Add `kv_caches` back to remaining `*ModelRunner.execute_model()`
f6c8e2a0
Fix kernel tests
c917880d
hmellor requested a review from tlrmchlsmth 355 days ago
Kick CI
6a296980
Merge branch 'main' into remove-unused-attn-args
cd1e8452
Fix missing import
f8b4d362
Fix call to `execute_model` in encoder decoder model runner
39742a3a
Fix call to `execute_model` in XPU model runner
cc087b05
Fix call to `execute_model` in multi-step model runner
6f703ba5
Fix V1 TPU model runner
d0ee4313
Fix multi-step model runner
29cff77f
Merge branch 'main' into remove-unused-attn-args
7e0c8083
comaniac approved these changes on 2025-02-21
hmellor closed this 354 days ago
hmellor reopened this 354 days ago
Deprecate args in `Attention.forward` instead
5d84b992
youkaichao approved these changes on 2025-02-22
youkaichao commented on 2025-02-22
Revert "Deprecate args in `Attention.forward` instead"
8925e30a
youkaichao commented on 2025-02-22
heheda12345 approved these changes on 2025-02-23
Merge branch 'main' into remove-unused-attn-args
b7ec2d91
Fix `mllama` KV cache access
a775d1ce
simon-mo merged cdc1fa12 into main 350 days ago
hmellor deleted the remove-unused-attn-args branch 350 days ago
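The commit titles above suggest two ways of retiring the unused `kv_cache` and `attn_metadata` arguments: removing them from the signature outright (the first commit), or keeping them with a deprecation warning ("Deprecate args in `Attention.forward` instead", which was later reverted). The following is only a minimal sketch contrasting the two approaches. `ToyAttentionDeprecated`, `ToyAttentionRemoved`, and the placeholder `_attend` computation are hypothetical names made up for illustration; they are not vLLM's classes and do not reproduce this PR's diff.

```python
import warnings


def _attend(query, key, value):
    # Placeholder computation so the sketch runs without any dependencies;
    # a real attention layer would do something entirely different here.
    return [q * k * v for q, k, v in zip(query, key, value)]


class ToyAttentionDeprecated:
    """Hypothetical layer that still accepts the unused arguments."""

    def forward(self, query, key, value, kv_cache=None, attn_metadata=None):
        # Deprecation route: old call sites keep working but get a warning.
        if kv_cache is not None or attn_metadata is not None:
            warnings.warn(
                "kv_cache and attn_metadata are unused and will be removed",
                DeprecationWarning,
                stacklevel=2,
            )
        return _attend(query, key, value)


class ToyAttentionRemoved:
    """Hypothetical layer with the unused arguments removed outright."""

    def forward(self, query, key, value):
        # Removal route: passing the old keyword arguments raises TypeError.
        return _attend(query, key, value)
```

With outright removal, every caller has to be updated in the same change, which may be why so many of the commits above touch individual model runners; the deprecation route would instead trade that churn for a longer-lived, unused parameter.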
Reviewers
heheda12345
youkaichao
comaniac
WoosukKwon
robertgshaw2-redhat
njhill
ywang96
alexm-redhat
tlrmchlsmth
Assignees
No one assigned
Labels
documentation
speculative-decoding
ready
v1
Milestone
No milestone