text-generation-inference
22fb1be5
- Fix cache block size for flash decoding (#2351)
Commit
1 year ago
Fix cache block size for flash decoding (#2351)

* Fix cache block size for flash decoding

  This seems to have been accidentally dropped during the TRT-LLM PR rebase.

* Also run CI on changes to `backends`
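Flash decoding partitions the KV cache into fixed-size blocks and needs a larger block size than regular paged attention, so the cache block size has to track which attention backend is active. A minimal sketch of what such a selection can look like, purely for illustration; the `ATTENTION` environment variable and the 256/16 values are assumptions, not necessarily the exact code this commit restores:

```python
import os

# The attention backend is assumed here to be chosen via an environment
# variable; the variable name and its values are illustrative assumptions.
ATTENTION = os.getenv("ATTENTION", "paged")

# Flash decoding splits the KV cache into larger fixed-size blocks than
# regular paged attention, so the block size must follow the backend.
# The 256/16 split is a common flash-decoding configuration and is an
# assumption, not necessarily the value this commit reinstates.
if ATTENTION == "flashdecoding":
    BLOCK_SIZE: int = 256
else:
    BLOCK_SIZE: int = 16
```

If a rebase drops the backend-dependent branch and the block size silently falls back to the paged-attention default, the flash-decoding kernels see a cache laid out with the wrong block granularity, which is the kind of regression this commit fixes.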
References
#2351 - Fix cache block size for flash decoding
Author
danieldk
Parents
9ab99374