DeepSpeed
2466fd9d - packed flash attn with mask works

Commit
2 years ago