onnxruntime

Attention(23) CUDA #26466 (Merged)

titaiwangms merged 34 commits into main from titaiwang/support_attention_cuda
1e7d5ae1  refactor redundant condition checks (titaiwangms)
49d7a428  sync to Xavier's cpu refactors (titaiwangms)
246a4d1f  Merge branch 'main' into titaiwang/support_attention_cuda (titaiwangms)
f2449830  fix attention-cpu build (titaiwangms)
8274bb12  draft (titaiwangms)
78f5d618  lint - draft (titaiwangms)
53d4e834  Merge branch 'main' into titaiwang/support_attention_cuda (titaiwangms)
43623ad5  fix typo (titaiwangms)
08f15f6d  typo-2 (titaiwangms)
0e754438  update namespace (titaiwangms)
titaiwangms added the ep:CUDA label
277648d8  Merge branch 'main' into titaiwang/support_attention_cuda (titaiwangms)
5253dd05  update doc (titaiwangms)
4db63bcb  removed deprecated functions in onnx (titaiwangms)
0a7e5f9d  Revert "removed deprecated functions in onnx" (titaiwangms)
6b18bb46  Merge branch 'main' into titaiwang/support_attention_cuda (titaiwangms)
a1ed3d9d  fix qkv space - support 3d default (titaiwangms)
b4629300  turn 4d to tru on disable cuda (titaiwangms)
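The two layout commits above concern the Q/K/V shapes the opset-23 Attention operator accepts: 3D inputs of shape (batch, sequence, hidden) with hidden = num_heads * head_size, or 4D inputs of shape (batch, num_heads, sequence, head_size), the BNSH layout. A minimal NumPy sketch of the 3D-to-BNSH mapping; the helper name is illustrative, not code from this PR:

```python
# Sketch of the two Q/K/V layouts accepted by Attention(23).
# to_bnsh is a hypothetical helper, not code from this PR.
import numpy as np

def to_bnsh(x: np.ndarray, num_heads: int) -> np.ndarray:
    """Reshape 3D (batch, seq, num_heads*head_size) into the
    4D BNSH layout (batch, num_heads, seq, head_size)."""
    batch, seq, hidden = x.shape
    head_size = hidden // num_heads
    x = x.reshape(batch, seq, num_heads, head_size)
    return x.transpose(0, 2, 1, 3)

q3d = np.random.rand(2, 8, 4 * 16).astype(np.float32)  # (B, S, N*H)
q4d = to_bnsh(q3d, num_heads=4)                         # (B, N, S, H)
assert q4d.shape == (2, 4, 8, 16)
```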
0494e955  refactor attn_mask (titaiwangms)
2dc706a5  simplify (titaiwangms)
5f0b6cd3  Merge branch 'main' into titaiwang/support_attention_cuda (titaiwangms)
88e631cd  support 4d and fix attn_mask bug (titaiwangms)
000d394d  disregard softcap and softmax_precision (titaiwangms)
739e88fc  Merge branch 'main' into titaiwang/support_attention_cuda (titaiwangms)
6d6d4782  fix offset in is_causal (titaiwangms)
792445a2  add past_seq_length to softmax bias add for causal (titaiwangms)
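The two causal-mask commits above (6d6d4782 and 792445a2) concern the offset needed when a KV cache is present: query row i sits at absolute position past_seq_length + i, so under is_causal it may attend to key columns j <= past_seq_length + i, and the mask is applied as an additive bias before softmax. A hedged NumPy sketch of that bias, not the CUDA kernel from the PR:

```python
# Illustration of the past_seq_length offset in the causal softmax bias;
# a sketch of the masking rule only, not the kernel code.
import numpy as np

def causal_bias(q_len: int, past_len: int) -> np.ndarray:
    """Additive bias of shape (q_len, past_len + q_len): 0 where a query
    may attend, -inf where the causal mask forbids it."""
    i = np.arange(q_len)[:, None]             # query rows (new tokens)
    j = np.arange(past_len + q_len)[None, :]  # key columns (past + new)
    return np.where(j <= past_len + i, 0.0, -np.inf).astype(np.float32)

scores = np.random.rand(3, 5).astype(np.float32)   # q_len=3, past_len=2
masked = scores + causal_bias(q_len=3, past_len=2)
# row 0 sees key columns 0..2, row 1 sees 0..3, row 2 sees 0..4
```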
a26d812c  resolve merge conflict (titaiwangms)
fbbf0b5c  update failing cuda tests (titaiwangms)
titaiwangms marked this pull request as ready for review 83 days ago
titaiwangms requested a review from xadupre 83 days ago
titaiwangms requested a review from tianleiwu 83 days ago
tianleiwu commented three times on 2025-12-17
70103087  Merge branch 'main' into titaiwang/support_attention_cuda (titaiwangms)
2c793b67  delete past_sequence_length and use flag output_is_Q_K_V_BNSH (titaiwangms)
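Commit 2c793b67 drops the explicit past_sequence_length argument in favor of the output_is_Q_K_V_BNSH flag, which presumably records whether the prepared Q/K/V buffers are already in BNSH layout. For orientation only, a sketch of a BNSH KV cache growing along the sequence axis; the shapes are invented for the example:

```python
# BNSH = (batch, num_heads, sequence, head_size). Invented shapes,
# illustrating the layout the output_is_Q_K_V_BNSH flag refers to;
# not code from the PR.
import numpy as np

B, N, H = 2, 4, 16
past_k = np.zeros((B, N, 5, H), dtype=np.float32)  # cached keys, S_past = 5
new_k  = np.ones((B, N, 3, H), dtype=np.float32)   # this step's keys, S_new = 3
present_k = np.concatenate([past_k, new_k], axis=2)
assert present_k.shape == (B, N, 8, H)             # sequence axis grew by S_new
```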
tianleiwu commented twice on 2026-01-09
ab41d04c  add kv_sequence_length to softmax (titaiwangms)
titaiwangms requested a review from tianleiwu 56 days ago
xadupre commented on 2026-01-13
4a8f502b  Merge branch 'main' into titaiwang/support_attention_cuda (titaiwangms)
2a9167e0  remove kv_sequence_length in softmax and disable cross attn causal tests (titaiwangms)
titaiwangms added this to the 1.24.0 milestone 55 days ago
titaiwangms removed this from the 1.24.0 milestone 55 days ago
tianleiwu changed the title from "Attenion(23) CUDA" to "Attention(23) CUDA" 55 days ago
tianleiwu commented twice on 2026-01-14
tianleiwu dismissed these changes on 2026-01-14
ff7e7673  address reviews - comments (titaiwangms)
c0018852  Merge branch 'main' into titaiwang/support_attention_cuda (titaiwangms)
titaiwangms dismissed their stale review via c0018852 54 days ago
titaiwangms added this to the 1.24.0 milestone 54 days ago
titaiwangms requested a review from xadupre 54 days ago
titaiwangms requested a review from tianleiwu 54 days ago
tianleiwu approved these changes on 2026-01-14
titaiwangms enabled auto-merge (squash) 54 days ago
titaiwangms merged a3e477e0 into main 54 days ago
titaiwangms deleted the titaiwang/support_attention_cuda branch 54 days ago
