ggml : add ALiBi support for ggml_soft_max_ext #5488
ggml : avoid recomputing alibi slopes (CPU)
7e0c3778
llama : reuse hparams.f_max_alibi_bias in all cases
6ca762ec
ggml : support alibi bias in ggml_soft_max_ext (CPU + Metal)
5055a0c9
ggml : handle all SRCs (do not break on first null)
69da57c0
tests : do not use slope for large soft_max
5261fb2d
ggml : alternative ALiBi without extra tensor
97d6a0cc
cuda : add ALiBi support in ggml_soft_max_ext
a0f8a93b
ggml : deprecate ggml_alibi
0fe2d560
ggml : support multi-sequence ALiBi (Metal)
996f7f4e
ggerganov force-pushed from 733c4774 to 996f7f4e 1 year ago
cuda : add multi-seq ALiBi + remove F16 soft_max
8c7b9ee2
ggml : update deprecation message
e3d4b99a
ggml : fix pos ptr when no ALiBi
b2c055b8
ggerganov marked this pull request as ready for review 1 year ago
cuda : fix performance (pow -> powf)
113e0d5d
cuda : precompute ALiBi constants
7fd024c5
slaren commented on 2024-02-15
Merge branch 'master' into gg/refactor-alibi
ac91033c
metal : pre-compute ALiBi slopes
833490b1
llama : init kq_pos only if needed
1657f92d
test-backend-ops : add null pos test to soft_max
aaa20e1f
slaren approved these changes on 2024-02-17
ggerganov merged 8f1be0d4 into master 1 year ago
ggerganov deleted the gg/refactor-alibi branch 1 year ago