text-generation-inference
38cff84a
- feat: support flash attention 2 in qwen2 vl vision blocks (#2721)
Commit
1 year ago
feat: support flash attention 2 in qwen2 vl vision blocks (#2721)
* feat: support flash attention 2 in qwen2 vl vision blocks
* fix: calc max_seqlen once and small refactors
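The commit message describes wiring flash attention 2 into the Qwen2-VL vision blocks and computing max_seqlen once instead of re-deriving it per block. A minimal sketch of that idea is shown below; this is not the actual text-generation-inference code, and the function names, tensor shapes, and the `vision_flash_attention` helper are assumptions for illustration only.

```python
# Illustrative sketch only, not the TGI implementation.
# Assumes the flash-attn package (version 2.x) is installed.
import torch
from flash_attn import flash_attn_varlen_func


def max_seqlen_from_cu_seqlens(cu_seqlens: torch.Tensor) -> int:
    # Derive the longest packed vision sequence once, up front,
    # so every vision block can reuse the same value.
    return int((cu_seqlens[1:] - cu_seqlens[:-1]).max().item())


def vision_flash_attention(
    q: torch.Tensor,           # (total_tokens, num_heads, head_dim)
    k: torch.Tensor,           # same shape as q
    v: torch.Tensor,           # same shape as q
    cu_seqlens: torch.Tensor,  # (num_sequences + 1,) int32 prefix sums
    max_seqlen: int,           # precomputed once by the caller
) -> torch.Tensor:
    # Variable-length flash attention 2 over the packed vision tokens.
    out = flash_attn_varlen_func(
        q, k, v,
        cu_seqlens_q=cu_seqlens,
        cu_seqlens_k=cu_seqlens,
        max_seqlen_q=max_seqlen,
        max_seqlen_k=max_seqlen,
    )
    # Flatten heads back into the hidden dimension.
    return out.reshape(out.shape[0], -1)
```

Computing `max_seqlen` once in the caller and passing it down avoids a redundant reduction over `cu_seqlens` inside each vision block, which matches the "calc max_seqlen once" note in the commit body.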
References
#2721 - feat: support flash attention 2 in qwen2 vl vision blocks
Author
drbh
Parents
3c9df21f