llama.cpp
ggml : support broadcast for ggml_soft_max_ext and ggml_flash_attn_ext
#14435
Merged
ggerganov merged 4 commits into master from gg/ggml-batch-soft-max-ops
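For context, here is a minimal sketch of what the broadcast support means for `ggml_soft_max_ext` (illustrative shapes and values, not code from this PR): the mask's dims 2 and 3 no longer have to match the scores tensor exactly, so for example a per-sequence mask with a singleton head dim can be shared across all attention heads. The same broadcast rule applies to the mask of `ggml_flash_attn_ext`; see the sketch after the commit list below.

```c
// Sketch of the new broadcast semantics for ggml_soft_max_ext.
// Shapes are illustrative assumptions, not taken from the PR diff.
// Graph construction only; backend scheduling/compute is omitted.
#include "ggml.h"

int main(void) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    const int n_kv = 64, n_tokens = 8, n_head = 4, n_seq = 2;

    // attention scores: [n_kv, n_tokens, n_head, n_seq]
    struct ggml_tensor * kq = ggml_new_tensor_4d(ctx, GGML_TYPE_F32,
                                                 n_kv, n_tokens, n_head, n_seq);

    // mask with a singleton head dim: [n_kv, n_tokens, 1, n_seq];
    // with this PR it is broadcast across dim 2 (heads) instead of
    // having to be duplicated per head
    struct ggml_tensor * mask = ggml_new_tensor_4d(ctx, GGML_TYPE_F32,
                                                   n_kv, n_tokens, 1, n_seq);

    // fused scale + mask + softmax; scale would be 1/sqrt(d_head) in
    // attention, max_bias = 0.0f disables ALiBi
    struct ggml_tensor * probs = ggml_soft_max_ext(ctx, kq, mask, 0.125f, 0.0f);
    (void) probs;

    ggml_free(ctx);
    return 0;
}
```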
github-actions added the testing, ggml, and Apple Metal labels
ggerganov force-pushed from e6faa451 to 236682a7 130 days ago
github-actions added the Nvidia GPU, Vulkan, and Ascend NPU labels
ggerganov force-pushed from 236682a7 to 572a062e 130 days ago
ggerganov force-pushed from 572a062e to 852529e9 130 days ago
ggerganov force-pushed from 852529e9 to bdfd7b75 130 days ago
ggerganov marked this pull request as ready for review 130 days ago
github-actions added the SYCL label
ggerganov force-pushed from bdfd7b75 to 461cb2f3 129 days ago
JohannesGaessler self-requested a review 126 days ago
Commits
32366701 ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (#14435)
b7265648 vulkan: support softmax/FA batch and broadcast (#14449)
3045a1eb CUDA: broadcasting for FlashAttention mask (#14500)
3b38afdf CUDA: add softmax broadcast (#14475)
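The Vulkan and CUDA commits above port the same mask-broadcast rule to those backends. Below is a hedged sketch of a broadcast mask with `ggml_flash_attn_ext`, assuming the current signature (q, k, v, mask, scale, max_bias, logit_softcap) and illustrative shapes; `GGML_PAD` and `GGML_KQ_MASK_PAD` are ggml's existing padding helpers.

```c
// Sketch (assumed shapes, not code from this PR): a per-sequence mask with a
// singleton head dim is broadcast across heads by ggml_flash_attn_ext.
// Graph construction only; backend scheduling/compute is omitted.
#include "ggml.h"

int main(void) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 64*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    const int d_head = 64, n_tokens = 8, n_kv = 128, n_head = 8, n_seq = 2;

    // Q/K/V in the permuted layout ggml_flash_attn_ext expects:
    // [d_head, seq_len, n_head, n_seq]
    struct ggml_tensor * q = ggml_new_tensor_4d(ctx, GGML_TYPE_F32, d_head, n_tokens, n_head, n_seq);
    struct ggml_tensor * k = ggml_new_tensor_4d(ctx, GGML_TYPE_F16, d_head, n_kv,     n_head, n_seq);
    struct ggml_tensor * v = ggml_new_tensor_4d(ctx, GGML_TYPE_F16, d_head, n_kv,     n_head, n_seq);

    // one mask per sequence, singleton head dim; mask rows are padded as the
    // op requires
    const int n_tokens_pad = GGML_PAD(n_tokens, GGML_KQ_MASK_PAD);
    struct ggml_tensor * mask = ggml_new_tensor_4d(ctx, GGML_TYPE_F16,
                                                   n_kv, n_tokens_pad, 1, n_seq);

    // scale = 1/sqrt(d_head); max_bias = 0 (no ALiBi); logit_softcap = 0 (off)
    struct ggml_tensor * out = ggml_flash_attn_ext(ctx, q, k, v, mask, 0.125f, 0.0f, 0.0f);
    (void) out;

    ggml_free(ctx);
    return 0;
}
```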
ggerganov force-pushed from be8d4700 to 3b38afdf 126 days ago
ggerganov merged 55a1c5a5 into master 126 days ago
ggerganov deleted the gg/ggml-batch-soft-max-ops branch 126 days ago
Reviewers
JohannesGaessler
Assignees
No one assigned
Labels
testing
Nvidia GPU
Vulkan
ggml
SYCL
Apple Metal
Ascend NPU
Milestone
No milestone