onnxruntime
a2c62832 - Fix Packed MultiHead Attention (#17996)

Commit
2 years ago
### Description
Initialize previously uninitialized parameters that were causing the op to crash.

### Motivation and Context
Fixes a CUDA memory misalignment / illegal memory access error that occurred when FlashAttention was used in Packed Multi-Head Attention.