text-generation-inference
d9fbbaaf - Tied embeddings in MLP speculator. (#2473)

1 year ago
Tied embeddings in MLP speculator. (#2473)

* Tied embeddings in MLP speculator.
* Fixing the scale_weight when users decide to not use the speculation as much as defined in the config.
* Adding scaling support + optimize some ops.