test(ark): set MoE prefill perf token totals to 2K and 4K
In auto_round_extension/ark/test/test_moe_prefill_perf.py, each
PREFILL_SHAPES row summed to only 252-610 tokens, which is too small
to represent realistic MoE prefill workloads. Rework the matrix so every
row totals either 2048 (2K) or 4096 (4K) tokens across all experts,
giving a 2K and a 4K variant for each (E, N, K) shape. Tokens are
distributed evenly across experts, except in the "uneven" rows, which
keep a skewed distribution to exercise load imbalance. Labels gain a
"2K"/"4K" suffix (kept at 14 chars to fit the existing column width).
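
As a rough illustration of the distribution described above, a row's
per-expert token counts could be derived as in the sketch below. This is
not the code in test_moe_prefill_perf.py: split_evenly, split_skewed,
make_label, and the 50% hot-expert share are hypothetical names and values
chosen only for this example.

```python
def split_evenly(total_tokens: int, num_experts: int) -> list[int]:
    """Spread total_tokens across experts as evenly as possible."""
    base, rem = divmod(total_tokens, num_experts)
    # The first `rem` experts take one extra token so the sum is exact.
    return [base + 1 if i < rem else base for i in range(num_experts)]


def split_skewed(total_tokens: int, num_experts: int,
                 hot_fraction: float = 0.5) -> list[int]:
    """Give one 'hot' expert a large share and split the rest evenly.

    hot_fraction is an assumed value; the real "uneven" rows may use a
    different skew.
    """
    if num_experts < 2:
        raise ValueError("a skewed split needs at least two experts")
    hot = int(total_tokens * hot_fraction)
    return [hot] + split_evenly(total_tokens - hot, num_experts - 1)


def make_label(name: str, total_tokens: int, width: int = 14) -> str:
    """Append a 2K/4K suffix and pad the label to a fixed column width."""
    label = f"{name}-{total_tokens // 1024}K"
    if len(label) > width:
        raise ValueError(f"label {label!r} exceeds {width} chars")
    return label.ljust(width)


if __name__ == "__main__":
    for total in (2048, 4096):
        even = split_evenly(total, 8)
        skewed = split_skewed(total, 8)
        # Both variants must preserve the row total.
        assert sum(even) == total and sum(skewed) == total
        print(make_label("e8-even", total), even)
        print(make_label("e8-uneven", total), skewed)
```

Checking that each row sums to exactly 2048 or 4096 keeps the 2K and 4K
variants comparable across shapes regardless of how the tokens are split.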