ggml : add ALiBi support for ggml_soft_max_ext #5488
ggml : avoid recomputing alibi slopes (CPU)
7e0c3778
llama : reuse hparams.f_max_alibi_bias in all cases
6ca762ec
ggml : support alibi bias in ggml_soft_max_ext (CPU + Metal)
5055a0c9
ggml : handle all SRCs (do not break on first null)
69da57c0
tests : do not use slope for large soft_max
5261fb2d
ggml : alternative ALiBi without extra tensor
97d6a0cc
cuda : add ALiBi support in ggml_soft_max_ext
a0f8a93b
ggml : deprecate ggml_alibi
0fe2d560
ggml : support multi-sequence ALiBi (Metal)
996f7f4e
ggerganov force-pushed from 733c4774 to 996f7f4e 1 year ago
cuda : add multi-seq ALiBi + remove F16 soft_max
8c7b9ee2
ggml : update deprecation message
e3d4b99a
ggml : fix pos ptr when no ALiBi
b2c055b8
ggerganov marked this pull request as ready for review 1 year ago
cuda : fix performance (pow -> powf)
113e0d5d
cuda : precompute ALiBi constants
7fd024c5
slaren commented on 2024-02-15
Merge branch 'master' into gg/refactor-alibi
ac91033c
metal : pre-compute ALiBi slopes
833490b1
llama : init kq_pos only if needed
1657f92d
test-backend-ops : add null pos test to soft_max
aaa20e1f
slaren approved these changes on 2024-02-17
ggerganov merged 8f1be0d4 into master 1 year ago
ggerganov deleted the gg/refactor-alibi branch 1 year ago