text-generation-inference
Add chunked attn for L4
#3162
Open
Commits
add fix (mht-sharma, committed 259 days ago)
reverse flash causal change (mht-sharma, committed 259 days ago)
support cuda graphs (mht-sharma, committed 258 days ago)
fix bt (mht-sharma, committed 258 days ago)
force attn to flashdecoding (mht-sharma, committed 258 days ago)