whisper : add batched decoding #1486
whisper : add whisper_batch
3cbaaed0
whisper : move kv_self to whisper_state
8b943f98
whisper : full batched decoding support
91096daa
whisper : fix memory leak in whisper_batch
3d24e35f
whisper : fix mem leak again + remove obsolete function
b2123cb4
ggerganov marked this pull request as ready for review 2 years ago
whisper : clear kv cache when using whisper_decode API
d7760357
whisper : speed-up sampling
9006946e
whisper : fix decoders initializer
3ed9af34
bench : add batch size 5 bench
ae1bd690
whisper : add comment about the KV cache size
6c8a003a
whisper : add check for max number of decoders
820f4589
whisper : avoid starting sampling threads with bs=1
4c245ea1
whisper : enable beam-search by default
b7c82a37
cuda : sync llama.cpp fixes
270b1e48
ggerganov merged b6c5f49b into master 2 years ago