text-generation-inference
38cff84a - feat: support flash attention 2 in qwen2 vl vision blocks (#2721)

Commit · 1 year ago
feat: support flash attention 2 in qwen2 vl vision blocks (#2721)

* feat: support flash attention 2 in qwen2 vl vision blocks
* fix: calc max_seqlen once and small refactors
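A minimal sketch of the pattern this commit describes, not the actual TGI implementation: a Qwen2-VL-style vision attention block routed through flash attention 2's variable-length kernel, with `max_seqlen` computed once from `cu_seqlens` by the caller and reused by every block instead of being recomputed per block. `flash_attn_varlen_func` is the real flash-attn 2 API; the `VisionAttention` class, helper name, and tensor shapes here are illustrative assumptions.

```python
import torch
from flash_attn import flash_attn_varlen_func


def max_seqlen_from_cu_seqlens(cu_seqlens: torch.Tensor) -> int:
    # Computed once for the whole vision tower; every block reuses the result.
    return int((cu_seqlens[1:] - cu_seqlens[:-1]).max().item())


class VisionAttention(torch.nn.Module):
    # Hypothetical vision block, shaped after Qwen2-VL's packed-patch layout.
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.qkv = torch.nn.Linear(hidden_size, 3 * hidden_size)
        self.proj = torch.nn.Linear(hidden_size, hidden_size)

    def forward(self, x: torch.Tensor, cu_seqlens: torch.Tensor, max_seqlen: int):
        # x: (total_patches, hidden_size), patches from all images packed
        # back to back with no padding; cu_seqlens marks image boundaries.
        total, _ = x.shape
        q, k, v = (
            self.qkv(x)
            .view(total, 3, self.num_heads, self.head_dim)
            .unbind(dim=1)
        )
        # Flash attention 2 varlen kernel; max_seqlen was computed once
        # by the caller via max_seqlen_from_cu_seqlens().
        out = flash_attn_varlen_func(
            q, k, v,
            cu_seqlens_q=cu_seqlens,
            cu_seqlens_k=cu_seqlens,
            max_seqlen_q=max_seqlen,
            max_seqlen_k=max_seqlen,
            causal=False,  # vision blocks attend bidirectionally
        )
        return self.proj(out.reshape(total, -1))
```

Hoisting the `max_seqlen` computation out of the per-block forward pass avoids a repeated device-to-host sync (`.item()`) in every layer, which is the likely motivation for the "calc max_seqlen once" fix in the commit message.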