text-generation-inference
feat(server): use cuda graph in logits warping
#302
Merged
OlivierDehaene merged 5 commits into main from feat/token_graph
add cuda graphs to token warping (e2727387)
add cpu support (1df2aa03)
fix multinomial gpu cpu sync (3248fdfb)
OlivierDehaene force pushed from eb3af06b to 3248fdfb 3 years ago
cleanup (a944dd0f)
inline (b1f80702)
OlivierDehaene merged a6c18c39 into main 3 years ago
OlivierDehaene deleted the feat/token_graph branch 3 years ago
Reviewers: No reviews
Assignees: No one assigned
Labels: None yet
Milestone: No milestone