Add chunked attn for L4 #3162

mht-sharma wants to merge 5 commits into main from add_chunked_atn
mht-sharma add fix
33a7ec57
mht-sharma reverse flash causal change
3f343cdb
mht-sharma support cuda graphs
d2f8caff
mht-sharma fix bt
a7353c35
mht-sharma force attn to flashdecoding
2a10a28d
