More forward AD formulas
This PR:
- Corrects the forward AD formula of `torch.sgn`.
- The reason why we can't use `auto_element_wise` for this operations is rather subtle. I left a comment.
- This, in turn, fixes a problem we had in forward-over-backward for `linalg.svd` and other spectral decompositions (and `norm`, `linalg.norm`, `linalg.matrix_norm`) that were using `torch.abs` (whose derivative is given by `torch.sgn`.
- Implement the formula for a number of missing operations `nansum`, `amax`, `amin`...
- Simplified a few formulas, most notably the forward AD for `div` and the derivative of `norm`, `linalg.norm` and `vector_norm` for `ord=+-inf`.
- Correct the formula for `mean`, `std_mean`, `var_mean` when `dim` is provided and equal to `()` (or `None`)
- A few minor improvements to `sum_backward`, `unsqueeze_multiple` and formulas depending on them
- Fix the derivatives of `std_mean` and `std_var` (complex support,
ASAN, forward AD...)
Fixes: https://github.com/pytorch/pytorch/issues/67539
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80082
Approved by: https://github.com/zou3519