PR #15631 CUDA: fuse adds, fuse add with rms norm

am17an merged 7 commits into ggml-org:master from am17an:rms_norm_fused_add

CUDA: fused add with rms_norm_mul

fbbd94c9

Non-broadcast fuse works

2dcc02d1

Add fused adds

69bcd48c

am17an requested a review from JohannesGaessler 293 days ago

format

4d105783

am17an force pushed to 4d105783 293 days ago

Remove n_fuse from template params

5adf50ed

github-actions added the Nvidia GPU label

github-actions added the ggml label

JohannesGaessler commented on 2025-08-28

am17an force pushed 293 days ago

Address review comments

b64ba1cc

am17an force pushed to b64ba1cc 293 days ago

JohannesGaessler commented on 2025-08-28

JohannesGaessler approved these changes on 2025-08-28

Move template inside binbcast

f4488188

am17an merged 009b709d into master 293 days ago

am17an deleted the rms_norm_fused_add branch 293 days ago

ORippler commented on 2025-08-29

CISC commented on 2025-08-29

ORippler commented on 2025-08-29

Reviewers

JohannesGaessler

ORippler

CISC

Labels

Nvidia GPU, ggml