vllm
[Spec Decode] (1/2) Remove batch expansion
#8839
Merged

LiuXiaoxuanPKU
LiuXiaoxuanPKU draft
c4c5dabf
LiuXiaoxuanPKU w/o cuda graph support
cb08091e
LiuXiaoxuanPKU args and tests
8c10b115
LiuXiaoxuanPKU disable mqa for ngram and format
44930fba
LiuXiaoxuanPKU clean up and tests
e64c61b5
LiuXiaoxuanPKU revert example
541b7674
LiuXiaoxuanPKU minor
b6c1de3c
LiuXiaoxuanPKU minor
5824b78f
github-actions
LiuXiaoxuanPKU changed the title from "[Spec Decode] (1/2) Remove batch expansion w/o cuda graph" to "[Spec Decode] (1/2) Remove batch expansion" 1 year ago
LiuXiaoxuanPKU added the ready label
LiuXiaoxuanPKU requested a review from njhill 1 year ago
LiuXiaoxuanPKU requested a review from comaniac 1 year ago
comaniac commented on 2024-09-26
LiuXiaoxuanPKU fix tests -- chunked prefill and hidden states in spec dec
07aebc07
LiuXiaoxuanPKU fix
d6cb1cc6
LiuXiaoxuanPKU minor
bcc1fe95
LiuXiaoxuanPKU fix
b036d062
LiuXiaoxuanPKU Merge branch 'main' into remove_batch_expansion
b93694d9
sroy745
sroy745 commented on 2024-09-27
LiuXiaoxuanPKU fix sampler for beam search
35750a60
LiuXiaoxuanPKU revert num compute tokens
741068ae
LiuXiaoxuanPKU disable mqa scorer when draft model and target model have different m…
71be3402
LiuXiaoxuanPKU disable mqa for cuda graph
cff6b0fd
LiuXiaoxuanPKU fix partial comments
f4fb00b9
LiuXiaoxuanPKU fix comments
b3e86910
LiuXiaoxuanPKU fix sampler and spec dec tests
238e5a06
LiuXiaoxuanPKU remove backend
5063c95d
LiuXiaoxuanPKU force pushed to 5063c95d 1 year ago
LiuXiaoxuanPKU more test fix
70662b04
LiuXiaoxuanPKU Merge branch 'main' into remove_batch_expansion
878d2da4
LiuXiaoxuanPKU fix num_compute_token
0e32744b
LiuXiaoxuanPKU
LiuXiaoxuanPKU clean up
7ee29986
comaniac approved these changes on 2024-10-01
LiuXiaoxuanPKU more fix for num_compute_token
d39c8a93
LiuXiaoxuanPKU change log condition
79ac29ce
LiuXiaoxuanPKU
LiuXiaoxuanPKU add comments
6f3388b5
LiuXiaoxuanPKU query len for multi-step, specify ci backend
14253322
LiuXiaoxuanPKU fix ci
e5702a90
LiuXiaoxuanPKU fix
8e276648
LiuXiaoxuanPKU format
3f3c2228
varun-sundar-rabindranath commented on 2024-10-01
varun-sundar-rabindranath commented on 2024-10-01
LiuXiaoxuanPKU context_len for multi-step and encoder decoder, fix decode_len
27074227
LiuXiaoxuanPKU merged 15702038 into main 1 year ago

Assignees
No one assigned