Run optimizer tests with fake tensors (#89643)
This is a slight regression: RAdam and Adagrad don't appear to
trace at all under fake tensors. However, I think this more accurately
reflects the current state of affairs.
Along the way, fix some problems on the fake tensor path.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89643
Approved by: https://github.com/anjali411