vllm
7faf51f1
- [Bugfix] Re-enable prefill of max model length (#24446)
Commit
205 days ago
[Bugfix] Re-enable prefill of max model length (#24446)

Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
References
#25293 - [Refactor] Refactor FP8 & INT8 Quant Folder inside `w8a8`
Author
yannicks1
Committer
yewentao256
Parents
ff1daf6c