transformers
1f33023c - Flash-attn performance: remove cuda sync during inference (#33570)

Commit
1 year ago
Switch conditions to use short-circuit evaluation during inference, so the check that forces a CUDA sync is skipped.
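A minimal sketch of the idea behind this change, in plain Python so it runs without a GPU. It is not the code from the commit: `SyncCounter`, `positions_not_monotonic`, `needs_varlen_path_slow`, and `needs_varlen_path_fast` are hypothetical names, and the "sync" is simulated with a call counter. The assumption is that the expensive operand of the condition is a tensor check whose result must be copied to the host, and that a cheap Python comparison can be tested first.

```python
class SyncCounter:
    """Hypothetical stand-in for a GPU check that forces a host-device sync."""

    def __init__(self):
        self.syncs = 0

    def positions_not_monotonic(self, position_ids):
        # In real GPU code, turning a tensor reduction into a Python bool
        # blocks until the device is done. Here we only count the calls.
        self.syncs += 1
        return any(b < a for a, b in zip(position_ids, position_ids[1:]))


def needs_varlen_path_slow(counter, position_ids, query_length):
    # Expensive check is evaluated before the cheap one, so it runs
    # even for single-token decoding steps (query_length == 1).
    return (
        position_ids is not None
        and counter.positions_not_monotonic(position_ids)
        and query_length != 1
    )


def needs_varlen_path_fast(counter, position_ids, query_length):
    # Cheap comparison first: `and` short-circuits, so the expensive
    # check is never reached when query_length == 1.
    return (
        position_ids is not None
        and query_length != 1
        and counter.positions_not_monotonic(position_ids)
    )


if __name__ == "__main__":
    slow, fast = SyncCounter(), SyncCounter()
    # Simulated decoding step: one new token per forward pass.
    needs_varlen_path_slow(slow, [5], 1)
    needs_varlen_path_fast(fast, [5], 1)
    print("syncs with expensive check first:", slow.syncs)
    print("syncs with cheap check first:", fast.syncs)
```

Both orderings return the same boolean, since `and` is commutative in its result when the operands have no side effects that matter; only the number of expensive evaluations differs. During token-by-token generation the condition is evaluated on every step, which is why the operand order matters for inference speed.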