text-generation-inference
5fca30ee - fix(l4): fix fp8 logic on l4 (#2277)

Commit
1 year ago
fix(l4): fix fp8 logic on l4 (#2277)

* fix(l4): fix fp8 logic on l4
* also quant weights with single scale
* use marlin even on 89