transformers
aab08783 - Skip non-selected experts for mixtral and qwen2_moe (#32429)

Commit
253 days ago
Skip non-selected experts for mixtral and qwen2_moe (#32429)

* Skip non-selected experts for mixtral and qwen2_moe
* Fix: tensor tolist()
* WIP: tokenization test
* fix modular source of truth
* nits

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
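The idea named in the commit title can be illustrated with a toy sketch. This is not the code from the commit: it is a pure-Python stand-in that assumes scalar "hidden states" and hypothetical helper names (`route_tokens`, `moe_forward`), and only shows the general pattern of looping over experts that received at least one token instead of over every expert.

```python
from collections import defaultdict


def route_tokens(selected_experts):
    """Group token indices by the experts that were actually selected."""
    tokens_per_expert = defaultdict(list)
    for token_idx, expert_ids in enumerate(selected_experts):
        for expert_idx in expert_ids:
            tokens_per_expert[expert_idx].append(token_idx)
    return dict(tokens_per_expert)


def moe_forward(hidden, selected_experts, weights, experts):
    """Toy mixture-of-experts layer on scalars (hypothetical helper).

    Only experts that appear in `selected_experts` are called; experts
    with no routed tokens are skipped entirely.
    """
    out = [0.0] * len(hidden)
    calls = []  # records which experts were run, for inspection
    routed = route_tokens(selected_experts)
    for expert_idx in sorted(routed):
        calls.append(expert_idx)
        for token_idx in routed[expert_idx]:
            # Position of this expert among the token's top-k choices.
            slot = selected_experts[token_idx].index(expert_idx)
            out[token_idx] += weights[token_idx][slot] * experts[expert_idx](hidden[token_idx])
    return out, calls


# Four toy experts that scale their input; only experts 0 and 2 are selected.
experts = [lambda x, s=s: x * s for s in (1.0, 2.0, 3.0, 4.0)]
hidden = [1.0, 2.0, 3.0]
selected = [[0, 2], [2, 0], [0, 2]]            # top-2 expert ids per token
weights = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]  # routing weights per slot
out, calls = moe_forward(hidden, selected, weights, experts)
print(calls)  # experts 1 and 3 never run
print(out)
```

In this sketch the saving comes from never invoking experts 1 and 3; a tensor-based implementation would presumably derive the set of hit experts from the routing mask instead of a dictionary.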