Fix checkpoint save/load of optimizer state with ZeRO stage 0 + bf16 (#7786)
When using `bf16=True` with `zero_optimization.stage=0`, the optimizer
state is not saved or loaded during checkpointing. The optimizer's
`step` counter and other states (`exp_avg`, `exp_avg_sq`) are lost after
loading a checkpoint.
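
For reference, the affected combination can be sketched as a config dict. The key names follow the usual DeepSpeed JSON config layout (`bf16.enabled`, `zero_optimization.stage`). The helper `is_bf16_without_zero` is hypothetical and only illustrates the condition; it is not a DeepSpeed API and not the flag changed in this PR. Treating a missing `stage` as 0 is also an assumption of this sketch.

```python
# Minimal sketch of the configuration that triggers the problem.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {"stage": 0},
}


def is_bf16_without_zero(config: dict) -> bool:
    """Hypothetical helper: True when bf16 is enabled and the ZeRO stage is 0.

    Assumes a missing "stage" key means stage 0.
    """
    bf16_enabled = config.get("bf16", {}).get("enabled", False)
    zero_stage = config.get("zero_optimization", {}).get("stage", 0)
    return bool(bf16_enabled) and zero_stage == 0


# The config above is the problematic case described in this message.
print(is_bf16_without_zero(ds_config))
```

With this combination, a save/load round trip is expected to preserve `step`, `exp_avg`, and `exp_avg_sq`.
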
This PR addresses the issue by fixing the flag that indicates this
configuration, and it adds a test argument to cover the problematic case.
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>