llama.cpp
10fcc412 - vulkan: Update topk_moe fusion to handle gpt's late softmax (#16656)

Commit
48 days ago
vulkan: Update topk_moe fusion to handle gpt's late softmax (#16656)

* vulkan: Update topk_moe fusion to handle gpt's late softmax

  Based on #16649.

* Add ggml_check_edges
* Add sync logging to show fusion effects
* handle clamp added in #16655
* Update ggml/src/ggml-impl.h

Co-authored-by: Diego Devesa <slarengh@gmail.com>