text-generation-inference
5fca30ee
- fix(l4): fix fp8 logic on l4 (#2277)
Commit
1 year ago
fix(l4): fix fp8 logic on l4 (#2277)
* fix(l4): fix fp8 logic on l4
* also quant weights with single scale
* use marlin even on 89
References
#2277 - fix(l4): fix fp8 logic on l4
Author
OlivierDehaene
Parents
abc32537