llama.cpp
ggml : add ALiBi support for ggml_soft_max_ext
#5488
Merged

ggerganov merged 18 commits into master from gg/refactor-alibi
ggerganov ggml : avoid recomputing alibi slopes (CPU)
7e0c3778
ggerganov llama : reuse hparams.f_max_alibi_bias in all cases
6ca762ec
ggerganov ggml : support alibi bias in ggml_soft_max_ext (CPU + Metal)
5055a0c9
ggerganov commented on 2024-02-14
ggerganov ggml : handle all SRCs (do not break on first null)
69da57c0
ggerganov tests : do not use slope for large soft_max
5261fb2d
ggerganov ggml : alternative ALiBi without extra tensor
97d6a0cc
ggerganov cuda : add ALiBi support in ggml_soft_max_ext
a0f8a93b
ggerganov ggml : deprecate ggml_alibi
0fe2d560
ggerganov ggml : support multi-sequence ALiBi (Metal)
996f7f4e
ggerganov force pushed from 733c4774 to 996f7f4e 1 year ago
ggerganov cuda : add multi-seq ALiBi + remote F16 soft_max
8c7b9ee2
ggerganov ggml : update deprecation message
e3d4b99a
ggerganov ggml : fix pos ptr when no ALiBi
b2c055b8
ggerganov marked this pull request as ready for review 1 year ago
ggerganov requested a review from slaren 1 year ago
ggerganov requested a review from JohannesGaessler 1 year ago
JohannesGaessler approved these changes on 2024-02-15
JohannesGaessler commented on 2024-02-15
ggerganov cuda : fix performance (pow -> powf)
113e0d5d
ggerganov cuda : precompute ALiBi constants
7fd024c5
slaren commented on 2024-02-15
ggerganov Merge branch 'master' into gg/refactor-alibi
ac91033c
ggerganov metal : pre-compute ALiBi slopes
833490b1
ggerganov llama : init kq_pos only if needed
1657f92d
ggerganov requested a review from slaren 1 year ago
slaren test-backend-ops : add null pos test to soft_max
aaa20e1f
slaren approved these changes on 2024-02-17
ggerganov merged 8f1be0d4 into master 1 year ago
ggerganov deleted the gg/refactor-alibi branch 1 year ago
