DeepSpeed: adds triton flash attention2 kernel (#4337, Merged)
Commits (14)

- initial commit (styoun, 2 years ago)
- temp commit: needs debugging (styoun, 2 years ago)
- packed flash attn with mask works (styoun, 2 years ago)
- clean-up (styoun, 2 years ago)
- add bert/roberta tests to test_inference (styoun, 2 years ago)
- is_triton_supported added to Accelerator class (styoun, 2 years ago)
- triton supports the flash attention when compute cap > 8.0 (styoun, 2 years ago)
- formatting (styoun, 2 years ago)
- fix comments (styoun, 2 years ago)
- cleanup (styoun, 2 years ago)
- cleanup flash kernel (styoun, 2 years ago)
- Merge branch 'master' into styoun/triton-flash2 (styoun, 2 years ago)
- fix according to the PR comment (styoun, 2 years ago)
- Merge branch 'master' into styoun/triton-flash2 (lekurile, 2 years ago)
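The commit "packed flash attn with mask works" concerns fusing masked attention into a single triton kernel. As a point of reference, the math that such a kernel computes is ordinary scaled dot-product attention with an additive mask. The sketch below is a naive, unfused pure-Python version for clarity only; it is not DeepSpeed or triton code, and the function name and layout are illustrative assumptions.

```python
import math

def attention(q, k, v, mask):
    """Reference scaled dot-product attention with an additive mask.

    q, k, v: lists of row vectors, shape (seq_len, head_dim).
    mask: (seq_len, seq_len) additive mask, 0.0 for visible positions
    and -inf for masked ones (e.g. a causal mask).
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for qi, mrow in zip(q, mask):
        # scores = (q . k^T) * scale + mask
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) + m
                  for kj, m in zip(k, mrow)]
        # numerically stable softmax over the key dimension
        mx = max(scores)
        exps = [math.exp(s - mx) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]
        # output row is the attention-weighted sum of value rows
        out.append([sum(wi * vj[t] for wi, vj in zip(w, v))
                    for t in range(len(v[0]))])
    return out

# A causal mask blocks position 0 from seeing position 1, so the first
# output row reduces to the first value row exactly.
neg = float("-inf")
out = attention(q=[[1.0, 0.0], [0.0, 1.0]],
                k=[[1.0, 0.0], [0.0, 1.0]],
                v=[[1.0, 0.0], [0.0, 1.0]],
                mask=[[0.0, neg], [0.0, 0.0]])
print(out[0])  # -> [1.0, 0.0]
```

A flash-attention kernel produces the same result but computes the softmax in streaming, tiled fashion so the full (seq_len, seq_len) score matrix never materializes in memory.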
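Two commits gate the kernel on hardware support: "is_triton_supported added to Accelerator class" and "triton supports the flash attention when compute cap > 8.0". A minimal sketch of that gate, assuming the check is on CUDA compute capability with Ampere (8.0) as the floor; the function name mirrors the commit message but the signature here is hypothetical, not the actual DeepSpeed Accelerator API:

```python
def is_triton_supported(compute_capability):
    """Return True when the (major, minor) CUDA compute capability
    meets the assumed 8.0 (Ampere) floor for the triton flash
    attention kernel."""
    major, minor = compute_capability
    return (major, minor) >= (8, 0)

# On a real system the tuple would come from the accelerator, e.g.
# torch.cuda.get_device_capability(); hard-coded values keep this
# sketch self-contained.
print(is_triton_supported((8, 0)))  # A100-class GPU: supported
print(is_triton_supported((7, 5)))  # T4-class GPU: not supported
```

Exposing the check on the accelerator class lets callers pick the triton kernel when it returns True and fall back to a non-triton attention path otherwise.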