llama.cpp
cbaadc92 - grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609)

Commit
1 year ago
grammars: 1.5x faster inference w/ complex grammars (vector reserves / reuses) (#6609)

* grammars: reserve rejects & next candidates
* grammars: reuse new_stacks
* grammars: fix missing sig change in llama.h
* grammars: fix test (api changed)
* grammars: update gbnf-validator.cpp
* grammars: simpler syntax (no swap)
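The commit message names two general techniques: reserving vector capacity up front and reusing an output vector across calls. The sketch below illustrates both in isolation. It is not the code from this commit: `Candidate`, `filter_rejects`, and `advance_stacks` are hypothetical names invented for illustration, and the stack representation (a plain vector of ints) is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a grammar candidate; not the llama.cpp type.
struct Candidate {
    std::size_t index;
    int         code_point;
};

// Technique 1: reserve before a loop of push_back calls.
// The input size is an upper bound on the output size, so a single
// allocation up front avoids repeated regrowth inside the loop.
std::vector<Candidate> filter_rejects(const std::vector<Candidate> & candidates, int accepted) {
    std::vector<Candidate> rejects;
    rejects.reserve(candidates.size());
    for (const Candidate & c : candidates) {
        if (c.code_point != accepted) {
            rejects.push_back(c);
        }
    }
    return rejects;
}

// Technique 2: write into a caller-owned buffer instead of returning a
// fresh vector. clear() destroys the old elements but leaves the outer
// vector's capacity in place, so repeated calls can reuse that storage.
void advance_stacks(const std::vector<std::vector<int>> & stacks,
                    int symbol,
                    std::vector<std::vector<int>> & new_stacks) {
    new_stacks.clear();
    for (const std::vector<int> & stack : stacks) {
        // Keep only stacks whose top matches the symbol, with the top popped.
        if (!stack.empty() && stack.back() == symbol) {
            new_stacks.emplace_back(stack.begin(), stack.end() - 1);
        }
    }
}
```

A caller would declare one `new_stacks` vector outside its per-token loop and pass it to every `advance_stacks` call. Taking the output by reference changes the function's signature, which may be the kind of change behind the "sig change in llama.h" and "api changed" items above, though that reading is an assumption.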