text-generation-inference
Add chunked attn for L4
#3162
Open
mht-sharma wants to merge 5 commits into main from add_chunked_atn
add fix (33a7ec57)
reverse flash causal change (3f343cdb)
support cuda graphs (d2f8caff)
fix bt (a7353c35)
force attn to flashdecoding (2a10a28d)
Reviewers: No reviews
Assignees: No one assigned
Labels: None yet
Milestone: No milestone