auto-round
cd4f4417 - test(ark): set MoE prefill perf token totals to 2K and 4K

Commit
9 days ago
In auto_round_extension/ark/test/test_moe_prefill_perf.py, the PREFILL_SHAPES rows summed to only 252-610 tokens, which is too small to represent realistic MoE prefill workloads.

Rework the matrix so every row totals either 2048 (2K) or 4096 (4K) tokens across all experts, giving a 2K and a 4K variant for each (E, N, K) shape. Tokens are distributed evenly across experts, except in the "uneven" rows, which keep a skewed distribution to exercise load imbalance.

Labels gain a "2K"/"4K" suffix (kept at 14 chars to fit the existing column width).
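As a rough illustration of the token budgeting described in this message, the sketch below splits a fixed total across experts, either evenly or with one overloaded expert. It is not the code from test_moe_prefill_perf.py: the helper names (even_split, skewed_split, make_label), the 50% "hot expert" share, the label layout, and padding the label out to 14 characters are all assumptions made for the example.

```python
# Hypothetical helpers; the actual test file may build PREFILL_SHAPES differently.

def even_split(total_tokens: int, num_experts: int) -> list[int]:
    """Split total_tokens across experts as evenly as possible."""
    base, rem = divmod(total_tokens, num_experts)
    # The first `rem` experts take one extra token so the row sums exactly.
    return [base + (1 if i < rem else 0) for i in range(num_experts)]


def skewed_split(total_tokens: int, num_experts: int,
                 hot_fraction: float = 0.5) -> list[int]:
    """Give one expert hot_fraction of the tokens and spread the rest evenly.

    The 0.5 default is an assumed skew, not a value taken from the commit.
    """
    if num_experts < 2:
        raise ValueError("a skewed split needs at least two experts")
    hot = int(total_tokens * hot_fraction)
    return [hot] + even_split(total_tokens - hot, num_experts - 1)


def make_label(name: str, total_tokens: int, width: int = 14) -> str:
    """Append a 2K/4K suffix and pad the label to a fixed column width."""
    suffix = {2048: "2K", 4096: "4K"}[total_tokens]
    label = f"{name}-{suffix}"
    if len(label) > width:
        raise ValueError(f"label {label!r} exceeds {width} chars")
    return label.ljust(width)


if __name__ == "__main__":
    for total in (2048, 4096):
        row = even_split(total, 8)
        print(make_label("E8-even", total), row, sum(row))
        row = skewed_split(total, 8)
        print(make_label("E8-uneven", total), row, sum(row))
```

Whichever distribution is used, each row should still sum to exactly 2048 or 4096, which makes a simple sum assertion a cheap check on the shape matrix.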