Fix ZeRO stage to choose BF16 optimizer in test (#7803)
Switch the test to ZeRO stage 1 so that it selects the BF16 optimizer.
(We should have switched to ZeRO stage 1 in #7788, but I missed the change.
@sfc-gh-truwase)
- #7790 removed the fallback that allowed bf16 model + fp32 grad
accumulation without ZeRO, so that combo now raises NotImplementedError.
- #7788 changed test_bf16_optimizer_fragments to force BF16_Optimizer by
setting grad_accum_dtype=fp32, but it kept ZeRO stage 0, which is invalid
after #7790.
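
For illustration, a minimal sketch of the config combination involved. The config keys are standard DeepSpeed config keys, but the batch size and the `check_combo` helper are assumptions made for this example; the helper only mirrors the rule described above and is not DeepSpeed's actual code or the test's exact config.

```python
# Illustrative DeepSpeed-style config; only keys relevant to this fix are shown.
config = {
    "train_micro_batch_size_per_gpu": 1,         # assumed value, not from the test
    "bf16": {"enabled": True},                   # bf16 model
    "data_types": {"grad_accum_dtype": "fp32"},  # forces the BF16 optimizer path
    "zero_optimization": {"stage": 1},           # previously 0, invalid after #7790
}


def check_combo(cfg):
    """Hypothetical helper mirroring the rule above; not part of DeepSpeed."""
    bf16 = cfg.get("bf16", {}).get("enabled", False)
    accum = cfg.get("data_types", {}).get("grad_accum_dtype")
    stage = cfg.get("zero_optimization", {}).get("stage", 0)
    if bf16 and accum == "fp32" and stage == 0:
        # Matches the behavior described for #7790: no fallback without ZeRO.
        raise NotImplementedError(
            "bf16 model with fp32 grad accumulation requires ZeRO"
        )
    return stage


check_combo(config)  # passes with stage 1
```

With `"stage": 0` in the same config, the helper raises NotImplementedError, which corresponds to the failure this change avoids.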
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>