Align AIME pass@1 with literature (#666)
Recent papers such as [SimpleRL-Zoo](https://arxiv.org/pdf/2503.18892) and [VAPO](https://arxiv.org/pdf/2504.05118) estimate AIME24 pass@1 from `n=32` samples per problem.
This PR bumps our default to `n=32` as well, so our reported numbers are directly comparable with theirs.
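
For reviewers unfamiliar with the metric, here is a minimal sketch of how pass@1 is commonly estimated from `n` samples per problem: take the fraction of correct samples for each problem, then average across problems. The function name `pass_at_1` and the input layout are hypothetical and are not taken from this repo's evaluation code.

```python
from statistics import mean


def pass_at_1(correct_by_problem: list[list[bool]]) -> float:
    """Estimate pass@1 as the mean per-problem fraction of correct samples.

    `correct_by_problem[i][j]` is True if sample j for problem i was correct.
    Hypothetical helper for illustration only.
    """
    if not correct_by_problem:
        raise ValueError("no problems given")
    per_problem = []
    for samples in correct_by_problem:
        if not samples:
            raise ValueError("each problem needs at least one sample")
        # sum() over booleans counts the True entries
        per_problem.append(sum(samples) / len(samples))
    return mean(per_problem)


# One problem, n=32 samples, 8 of them correct
print(pass_at_1([[True] * 8 + [False] * 24]))  # 0.25
```

AIME24 is a small benchmark, so a single sample per problem gives a noisy estimate; averaging over more samples reduces that variance at the cost of more generation.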