llama.cpp
vulkan: Update topk_moe fusion to handle gpt's late softmax
#16656
Merged


0cc4m merged 5 commits into ggml-org:master from jeffbolznv:topk_gpt
jeffbolznv requested a review from 0cc4m 61 days ago
github-actions added the Vulkan label
github-actions added the ggml label
jeffbolznv force pushed 58 days ago
jeffbolznv requested a review from ggerganov 58 days ago
jeffbolznv requested a review from slaren 58 days ago
0cc4m commented on 2025-10-25
jeffbolznv vulkan: Update topk_moe fusion to handle gpt's late softmax
6cccaef3
jeffbolznv Add ggml_check_edges
81853b56
jeffbolznv Add sync logging to show fusion effects
180eef4d
jeffbolznv handle clamp added in #16655
b046c734
jeffbolznv force pushed to b046c734 53 days ago
github-actions added the testing label
0cc4m approved these changes on 2025-10-29
slaren approved these changes on 2025-10-29
jeffbolznv Update ggml/src/ggml-impl.h
832ea836
0cc4m merged 10fcc412 into master 50 days ago
