vllm
4adc66f6
- [Bugfix] Allocate less memory in non-batched CUTLASS MoE (#21121)
Commit
156 days ago
[Bugfix] Allocate less memory in non-batched CUTLASS MoE (#21121)

Signed-off-by: ElizaWszola <ewszola@redhat.com>
References
#21121 - [Bugfix] Allocate less memory in non-batched CUTLASS MoE
Author
ElizaWszola
Parents
55ad6487