Attention(23) CUDA #26466
refactor redundant condition checks
1e7d5ae1
sync to Xavier's cpu refactors
49d7a428
Merge branch 'main' into titaiwang/support_attention_cuda
246a4d1f
fix attention-cpu build
f2449830
draft
8274bb12
lint - draft
78f5d618
Merge branch 'main' into titaiwang/support_attention_cuda
53d4e834
fix typo
43623ad5
typo-2
08f15f6d
update namespace
0e754438
Merge branch 'main' into titaiwang/support_attention_cuda
277648d8
update doc
5253dd05
removed deprecated functions in onnx
4db63bcb
Revert "removed deprecated functions in onnx"
0a7e5f9d
Merge branch 'main' into titaiwang/support_attention_cuda
6b18bb46
fix qkv space - support 3d default
a1ed3d9d
turn 4d to tru on disable cuda
b4629300
refactor attn_mask
0494e955
simplify
2dc706a5
Merge branch 'main' into titaiwang/support_attention_cuda
5f0b6cd3
support 4d and fix attn_mask bug
88e631cd
disregard softcap and softmax_precision
000d394d
Merge branch 'main' into titaiwang/support_attention_cuda
739e88fc
fix offset in is_causal
6d6d4782
add past_seq_length to softmax bias add for causal
792445a2
resolve merge conflict
a26d812c
update failing cuda tests
fbbf0b5c
titaiwangms
marked this pull request as ready for review 83 days ago
Merge branch 'main' into titaiwang/support_attention_cuda
70103087
delete past_sequence_length and use flag output_is_Q_K_V_BNSH
2c793b67
add kv_sequence_length to softmax
ab41d04c
Merge branch 'main' into titaiwang/support_attention_cuda
4a8f502b
remove kv_sequence_length in softmax and disable cross attn causal tests
2a9167e0
titaiwangms
added this to the 1.24.0 milestone 55 days ago
titaiwangms
removed this from the 1.24.0 milestone 55 days ago
tianleiwu
changed the title from "Attenion(23) CUDA" to "Attention(23) CUDA" 55 days ago
tianleiwu
dismissed these changes on 2026-01-14
address reviews - comments
ff7e7673
Merge branch 'main' into titaiwang/support_attention_cuda
c0018852
titaiwangms
dismissed their stale review via c0018852 54 days ago
titaiwangms
added this to the 1.24.0 milestone 54 days ago
tianleiwu
approved these changes on 2026-01-14
titaiwangms
deleted the titaiwang/support_attention_cuda branch 54 days ago
Assignees
No one assigned