text-generation-inference
feat(server): pre-allocate past key values for flash causal LM
#412
Merged

OlivierDehaene merged 8 commits into main from feat/faster_flash_cache
OlivierDehaene wip
5ff2dc91
OlivierDehaene working rw 7b
c9e74717
OlivierDehaene working
bfd6928c
OlivierDehaene fix
3fc87f93
OlivierDehaene add other models
c509e4e7
OlivierDehaene update commit
afdfe433
OlivierDehaene revert some changes
92a74ea0
OlivierDehaene force pushed from 0994ef7b to 92a74ea0 2 years ago
OlivierDehaene faster
4b9ebb0a
OlivierDehaene merged 5ce89059 into main 2 years ago
OlivierDehaene deleted the feat/faster_flash_cache branch 2 years ago
njhill
OlivierDehaene
