llama.cpp
vulkan: Update topk_moe fusion to handle gpt's late softmax
#16656

Merged

0cc4m merged 5 commits into ggml-org:master from jeffbolznv:topk_gpt

jeffbolznv requested a review from 0cc4m 61 days ago

github-actions added Vulkan

github-actions added ggml

jeffbolznv force pushed 58 days ago

jeffbolznv requested a review from ggerganov 58 days ago

jeffbolznv requested a review from slaren 58 days ago

0cc4m commented on 2025-10-25

vulkan: Update topk_moe fusion to handle gpt's late softmax

6cccaef3

Add ggml_check_edges

81853b56

Add sync logging to show fusion effects

180eef4d

handle clamp added in #16655

b046c734

jeffbolznv force pushed to b046c734 53 days ago

github-actions added testing

0cc4m approved these changes on 2025-10-29

slaren approved these changes on 2025-10-29

Update ggml/src/ggml-impl.h

832ea836

0cc4m merged 10fcc412 into master 50 days ago

Reviewers

slaren

0cc4m

ggerganov

Labels

testing, Vulkan, ggml