llama.cpp
fd6ae4ca - Tensor-parallel: Fix delayed AllReduce on Gemma-4 MoE (#22129)

Commit
46 days ago
Tensor-parallel: Fix delayed AllReduce on Gemma-4 MoE (#22129)

* Fix delayed AllReduce on Gemma-4 MoE

  Skip forward past nodes that don't consume the current one, and allow a chain of MULs.

* Check for all sources before skipping nodes

* Address review comments