whisper : use flash attention #2152
ggerganov force-pushed from 0935288f to 5bc21a58 1 year ago
ggerganov force-pushed from 3e7c3518 to 7017fa5a 1 year ago
ggerganov force-pushed from 497dbf40 to bfbfde85 1 year ago
whisper : use flash attention in the encoder (7c94a111)
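For orientation, here is a minimal sketch of what routing an attention block through ggml's fused flash-attention op looks like. The helper below is hypothetical; `ggml_flash_attn_ext` is the real ggml op, but its exact argument list has changed across ggml versions (newer revisions add trailing parameters such as `max_bias`), so treat the call as illustrative rather than a copy of the PR's code:

```c
// Minimal sketch (not the PR's actual code): one fused flash-attention call
// replaces the explicit mul_mat -> soft_max -> mul_mat attention chain.
#include <math.h>
#include "ggml.h"

// q, k, v are per-head activations; kq_mask is NULL for the non-causal
// encoder, or an F16 causal mask for the decoder's self-attention.
static struct ggml_tensor * attn_fused(
        struct ggml_context * ctx,
        struct ggml_tensor  * q,
        struct ggml_tensor  * k,
        struct ggml_tensor  * v,
        struct ggml_tensor  * kq_mask,
        int                   n_embd_head) {
    const float scale = 1.0f / sqrtf((float) n_embd_head);

    // A single fused kernel computes softmax(scale * Q K^T + mask) V without
    // materializing the full [n_kv, n_tokens] attention matrix.
    // NOTE: newer ggml versions take extra trailing arguments here.
    return ggml_flash_attn_ext(ctx, q, k, v, kq_mask, scale);
}
```

The win is memory as much as speed: the intermediate KQ matrix is never materialized, which matters for Whisper's 1500-frame encoder context.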
whisper : add kv_pad (2877b026)
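The kv_pad commit title suggests padding the key/value length so the fused kernels can assume a fixed block granularity. Below is a sketch of the idea using ggml's real `GGML_PAD` macro; the padding constant and the sizing helper are assumptions for illustration, not taken from the PR:

```c
#include "ggml.h"

// Assumed block size: flash-attention kernels typically require the KV length
// rounded up to some multiple. The actual constant in the PR may differ.
#define KV_PAD 256

// Hypothetical helper: size a K/V cache using the padded context length so
// attention views are always well-aligned for the fused kernels.
static size_t kv_cache_bytes(int n_ctx, int n_embd, int n_layer) {
    const int n_ctx_pad = GGML_PAD(n_ctx, KV_PAD); // round up to a multiple of KV_PAD
    // one K and one V buffer per layer, stored as f16
    return 2u * (size_t) n_ctx_pad * (size_t) n_embd * (size_t) n_layer
               * sizeof(ggml_fp16_t);
}
```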
whisper : remove extra backend instance (huh?) (4caa64b7)
whisper : use FA for cross-attention (db3872d0)
whisper : use FA for self-attention (07d616ac)
whisper : simplify encoder FA (066b544f)
ggerganov force-pushed from bfbfde85 to 066b544f 1 year ago
whisper : add flash_attn runtime parameter (22c96b47)
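This commit exposes the feature as an opt-in runtime switch rather than a build-time option. A sketch of enabling it from the C API, assuming the field is named `flash_attn` as in the commit title; `whisper_context_default_params` and `whisper_init_from_file_with_params` are existing whisper.cpp entry points:

```c
#include <stdbool.h>
#include <stdio.h>
#include "whisper.h"

int main(void) {
    struct whisper_context_params cparams = whisper_context_default_params();
    cparams.flash_attn = true; // opt in to the fused attention path

    // model path is illustrative
    struct whisper_context * ctx =
        whisper_init_from_file_with_params("models/ggml-base.en.bin", cparams);
    if (ctx == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // ... run whisper_full(ctx, ...) as usual; attention now uses FA ...

    whisper_free(ctx);
    return 0;
}
```

The bundled examples expose a matching command-line flag (`-fa` / `--flash-attn` in current whisper.cpp), so the path can be toggled for benchmarking without rebuilding.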
ggerganov marked this pull request as ready for review 1 year ago
scripts : add bench log (5dfb63e6)
scripts : add M1 Pro bench log (62287957)
ggerganov merged 7094ea5e into master 1 year ago
ggerganov deleted the gg/flash-attn branch 1 year ago