llama-cpp-python
perf: vectorize KV cache prefix matching with numpy
#2179
Open

nausicaalii
nausicaalii perf: vectorize prefix matching with numpy
d815bba0
nausicaalii refactor: deduplicate prefix matching and eliminate .tolist() overhead
aeb7d7cf
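
For readers unfamiliar with the technique named in the title, here is a minimal sketch of vectorized prefix matching with NumPy. It is not taken from this PR's diff; the function name `longest_common_prefix_len` and its arguments are hypothetical. It assumes the cached tokens and the new prompt tokens are available as 1-D integer sequences.

```python
import numpy as np

def longest_common_prefix_len(cached_tokens, new_tokens):
    """Return how many leading tokens the two sequences share.

    Hypothetical helper for illustration only; not the PR's code.
    """
    a = np.asarray(cached_tokens)
    b = np.asarray(new_tokens)
    n = min(a.shape[0], b.shape[0])
    if n == 0:
        return 0
    # Element-wise comparison over the overlapping region, done in C
    # rather than in a Python-level loop.
    mismatch = a[:n] != b[:n]
    if not mismatch.any():
        # No mismatch: the whole overlapping region is a shared prefix.
        return n
    # argmax on a boolean array returns the index of the first True,
    # which is the position of the first differing token.
    return int(np.argmax(mismatch))
```

Working on the arrays directly, instead of converting them to Python lists first, is one plausible reading of the ".tolist() overhead" mentioned in the second commit message.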

Reviewers
No reviews
Assignees
No one assigned