whisper.cpp
whisper : use flash attention
#2152
Merged


ggerganov merged 9 commits into master from gg/flash-attn
ggerganov force-pushed from 0935288f to 5bc21a58 1 year ago
ggerganov force-pushed from 3e7c3518 to 7017fa5a 1 year ago
ggerganov force-pushed from 497dbf40 to bfbfde85 1 year ago
7c94a111 whisper : use flash attention in the encoder
2877b026 whisper : add kv_pad
4caa64b7 whisper : remove extra backend instance (huh?)
db3872d0 whisper : use FA for cross-attention
07d616ac whisper : use FA for self-attention
066b544f whisper : simplify encoder FA
ggerganov force-pushed from bfbfde85 to 066b544f 1 year ago
22c96b47 whisper : add flash_attn runtime parameter
ggerganov marked this pull request as ready for review 1 year ago
5dfb63e6 scripts : add bench log
62287957 scripts : add M1 Pro bench log
ggerganov merged 7094ea5e into master 1 year ago
ggerganov deleted the gg/flash-attn branch 1 year ago
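Commit 22c96b47 makes flash attention a runtime parameter rather than a compile-time choice. In the whisper.cpp C API this surfaces as the `flash_attn` field of `whisper_context_params`, set before loading the model; a minimal sketch (the model path is a placeholder, and the exact API reflects whisper.cpp around the time of this merge):

```c
// Sketch: opting in to the flash_attn runtime parameter added in this PR.
// Assumes the whisper.cpp C API as of the merge; model path is a placeholder.
#include <stdio.h>
#include "whisper.h"

int main(void) {
    struct whisper_context_params cparams = whisper_context_default_params();
    cparams.flash_attn = true; // enable Flash Attention kernels at load time

    struct whisper_context * ctx =
        whisper_init_from_file_with_params("models/ggml-base.en.bin", cparams);
    if (ctx == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // ... run whisper_full() as usual; self- and cross-attention
    //     in the encoder and decoder now take the FA path ...

    whisper_free(ctx);
    return 0;
}
```

Because the flag is part of the context parameters, callers can fall back to the regular attention path on backends without FA support simply by leaving `flash_attn` at its default.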
