transformers
07e38311 - [CB] Changes for long generation (#45530)

Commit
29 days ago
[CB] Changes for long generation (#45530)

* Fix KV dedup for decode batches
* Fix memory estimation
* Change default
* Added write-only fast path
* Take both peaks into account
* Revert unused config field
* Review 1
* Fix p1s
* Fix p2s and p3s that needed it
* Added a TODO
* Fix test, lower max cached graph, add TODO
* Fix fragmentation with big warmup
* Add more space for logits processors
* Fix