text-generation-inference
153fcf77
- Fix incorrect cache allocation with multi-query (#2203)
Commit
1 year ago
Fix incorrect cache allocation with multi-query (#2203)

Previously, no cache memory was allocated for multi-query models (1 KV head). This fixes Starcoder and other multi-query models.
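
The commit message does not say how the allocation ended up empty. As an illustration only, the sketch below assumes one plausible cause: dividing the KV head count by the number of tensor-parallel shards with integer division, which turns a single KV head into zero heads per shard and therefore a zero-byte cache. The function names, parameters, and the size formula are hypothetical and are not taken from the repository.

```python
# Hypothetical sketch, not the repository's code: shows how a head count
# of zero per shard leads to an empty KV cache, and one way to guard it.


def naive_kv_heads_per_shard(num_kv_heads: int, world_size: int) -> int:
    """Assumed buggy variant: plain integer division across shards."""
    return num_kv_heads // world_size


def kv_heads_per_shard(num_kv_heads: int, world_size: int) -> int:
    """Assumed fixed variant: a single KV head (multi-query) is
    replicated on every shard instead of being divided."""
    if num_kv_heads == 1:
        return 1
    return num_kv_heads // world_size


def kv_cache_bytes(
    num_blocks: int,
    block_size: int,
    num_kv_heads: int,
    head_size: int,
    num_layers: int,
    dtype_bytes: int = 2,
) -> int:
    """Bytes needed for keys and values (factor 2) over all layers."""
    return (
        2 * num_layers * num_blocks * block_size
        * num_kv_heads * head_size * dtype_bytes
    )


if __name__ == "__main__":
    # Multi-query model (1 KV head) sharded over 2 devices.
    for fn in (naive_kv_heads_per_shard, kv_heads_per_shard):
        heads = fn(1, 2)
        size = kv_cache_bytes(
            num_blocks=1024, block_size=16, num_kv_heads=heads,
            head_size=128, num_layers=40,
        )
        print(f"{fn.__name__}: heads={heads}, cache bytes={size}")
```

Under these assumptions the naive variant reports zero heads and a zero-byte cache, while the guarded variant keeps one head per shard and a non-empty cache.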
Author
danieldk
Parents
cce475a9