llama.cpp
CUDA: fuse adds, fuse add with rms norm
#15631
Merged

am17an merged 7 commits into ggml-org:master from am17an:rms_norm_fused_add
am17an CUDA: fused add with rms_norm_mul
fbbd94c9
am17an Non-broadcast fuse works
2dcc02d1
am17an Add fused adds
69bcd48c
am17an requested a review from JohannesGaessler 293 days ago
am17an format
4d105783
am17an force pushed to 4d105783 293 days ago
am17an Remove n_fuse from template params
5adf50ed
github-actions added Nvidia GPU
github-actions added ggml
JohannesGaessler commented on 2025-08-28
JohannesGaessler commented on 2025-08-28
am17an force pushed 293 days ago
am17an Address review comments
b64ba1cc
am17an force pushed to b64ba1cc 293 days ago
JohannesGaessler commented on 2025-08-28
JohannesGaessler approved these changes on 2025-08-28
am17an Move template inside binbcast
f4488188
am17an merged 009b709d into master 293 days ago
am17an deleted the rms_norm_fused_add branch 293 days ago
ORippler commented on 2025-08-29
CISC commented on 2025-08-29
ORippler commented on 2025-08-29