Set the correct grad mode for train and eval correctness tests
Summary:
The correctness testing code should use the same grad mode as the actual test.
This fixes the correctness check of the mimo_cmf_30x train test when run without cudagraph.
This diff also switches the correctness check to the existing `torch._dynamo.testing.same` function instead of a separate comparison.
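
For illustration only, a minimal sketch of running the correctness check under the grad mode that matches the test type. The helper names (`grad_mode_for`, `run_once`, `check_correctness`) are hypothetical and are not the code in this diff, and the sketch uses the public `torch.allclose` as a stand-in for `torch._dynamo.testing.same`. It assumes "train" runs with gradients enabled and "eval" runs under `torch.no_grad()`.

```python
import torch


def grad_mode_for(test):
    """Return the grad-mode context matching the test type (hypothetical helper)."""
    # Assumption: "train" runs with gradients enabled, anything else under no_grad.
    return torch.enable_grad() if test == "train" else torch.no_grad()


def run_once(model, inputs, test):
    """Run the model once under the same grad mode the real test would use."""
    with grad_mode_for(test):
        return model(*inputs)


def check_correctness(model, inputs, test, rtol=1e-4, atol=1e-4):
    """Compare two runs made under the grad mode of the given test type."""
    ref = run_once(model, inputs, test)
    res = run_once(model, inputs, test)
    # A grad-mode mismatch between the two runs shows up as differing requires_grad.
    if ref.requires_grad != res.requires_grad:
        return False
    # Stand-in comparison; the diff itself uses torch._dynamo.testing.same.
    return torch.allclose(ref, res, rtol=rtol, atol=atol)


torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
inputs = (torch.randn(3, 4),)

train_out = run_once(model, inputs, "train")  # output is attached to the autograd graph
eval_out = run_once(model, inputs, "eval")    # output is detached from autograd
train_ok = check_correctness(model, inputs, "train")
eval_ok = check_correctness(model, inputs, "eval")
```

The point of the sketch is that the reference run and the checked run enter the same context, so a train check never silently compares against outputs produced under `torch.no_grad()`.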
Reviewed By: bertmaher
Differential Revision: D46164558
fbshipit-source-id: 72981690d675b6139c7f5615ba41b2df18923a2d