[Easy][FSDP] Add `zero_grad()` to unit test train loop (#80087)
This adds a `zero_grad()` call to the train step in `test_fsdp_optim_state.py` so that the loop reflects normal training use cases instead of implicitly exercising gradient accumulation.
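For context, a minimal sketch (not the actual test file) of the pattern this moves toward: calling `optimizer.zero_grad()` at the start of each step so gradients from the previous iteration do not accumulate. The tiny `Linear` model and optimizer here are illustrative only.

```python
import torch

# Illustrative model/optimizer, not from the FSDP test itself.
model = torch.nn.Linear(4, 4)
optim = torch.optim.Adam(model.parameters(), lr=1e-2)

for _ in range(2):
    optim.zero_grad()  # clear grads from the previous step (no accumulation)
    loss = model(torch.randn(2, 4)).sum()
    loss.backward()
    optim.step()

# Without the per-step zero_grad(), .backward() would sum gradients
# across iterations. zero_grad(set_to_none=True) resets them to None.
optim.zero_grad(set_to_none=True)
grads_cleared = all(p.grad is None for p in model.parameters())
```

Without the `zero_grad()` call, `loss.backward()` would add each step's gradients onto the previous step's, which is gradient accumulation rather than a typical train loop.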
Differential Revision: [D37726058](https://our.internmc.facebook.com/intern/diff/D37726058)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80087
Approved by: https://github.com/rohan-varma