DeepSpeed
1ec42552
- triton supports the flash attention when compute cap > 8.0
Commit
2 years ago
triton supports the flash attention when compute cap > 8.0
References
#4337 - adds triton flash attention2 kernel
Author
styoun
Parents
b6c47f6b
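
The condition in the commit message can be illustrated with a small sketch. This is not the code from the commit: the helper name `triton_flash_attention_allowed` and the constant `MIN_CAPABILITY` are hypothetical, and the strict comparison follows the message's "> 8.0" wording literally. The actual check in DeepSpeed may treat 8.0 itself differently.

```python
from typing import Tuple

# Hypothetical threshold taken from the commit message ("compute cap > 8.0").
MIN_CAPABILITY: Tuple[int, int] = (8, 0)


def triton_flash_attention_allowed(capability: Tuple[int, int]) -> bool:
    """Return True when (major, minor) is strictly greater than 8.0.

    Tuples compare element by element, so (8, 6) > (8, 0) is True
    while (8, 0) > (8, 0) and (7, 5) > (8, 0) are both False.
    """
    return capability > MIN_CAPABILITY


if __name__ == "__main__":
    # Example capabilities; on a CUDA machine the tuple could instead come
    # from torch.cuda.get_device_capability().
    for cap in [(7, 5), (8, 0), (8, 6), (9, 0)]:
        print(cap, triton_flash_attention_allowed(cap))
```

Passing the capability in as a plain tuple keeps the sketch runnable without a GPU or PyTorch installed.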