text-generation-inference
feat(server): use cuda graph in logits warping
#302
Merged

OlivierDehaene merged 5 commits into main from feat/token_graph
OlivierDehaene
OlivierDehaene add cuda graphs to token warping
e2727387
OlivierDehaene add cpu support
1df2aa03
OlivierDehaene fix multinomial gpu cpu sync
3248fdfb
OlivierDehaene force pushed from eb3af06b to 3248fdfb 3 years ago
OlivierDehaene cleanup
a944dd0f
OlivierDehaene inline
b1f80702
OlivierDehaene merged a6c18c39 into main 3 years ago
OlivierDehaene deleted the feat/token_graph branch 3 years ago
njhill
OlivierDehaene

Reviewers: No reviews
Assignees: No one assigned