onnxruntime
Fix cuda memory access violation in GQA FlashAttention
#24447
Merged
RyanUnderhill merged 1 commit into main from ryanunderhill/flashattention_crash_fix
zeros_ buffer was uninitialized so wasn't always zeros. This would le…
9d4c6bb0
RyanUnderhill requested a review from aciddelgado 301 days ago
aciddelgado approved these changes on 2025-04-16
baijumeswani approved these changes on 2025-04-16
baijumeswani commented on 2025-04-16
tianleiwu changed the title from "Fix cuda memory access violation in FlashAttention" to "Fix cuda memory access violation in GQA FlashAttention" 301 days ago
RyanUnderhill merged 99f2b806 into main 301 days ago
RyanUnderhill deleted the ryanunderhill/flashattention_crash_fix branch 301 days ago
Reviewers: baijumeswani, aciddelgado
Assignees: No one assigned
Labels: None yet
Milestone: No milestone