Make l1_loss composite
Fixing the forward AD for `sgn` in the next PR of this stack uncovered a
number of issues with the derivatives of `l1_loss`. Upon inspection,
`l1_loss` was essentially already a composite function, but it was not
differentiable as such. This PR implements it as a fully differentiable
composite function.
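For reference, a minimal sketch of what the composite formulation looks like (the standalone function `l1_loss_composite` is illustrative and assumed, not the actual ATen implementation; `input`, `target`, and `reduction` follow the `F.l1_loss` signature):

```python
import torch

def l1_loss_composite(input: torch.Tensor, target: torch.Tensor,
                      reduction: str = "mean") -> torch.Tensor:
    # Elementwise absolute difference. Every op here is differentiable,
    # so autograd can derive backward (and forward AD) automatically
    # instead of relying on hand-written derivative formulas.
    loss = (input - target).abs()
    if reduction == "mean":
        return loss.mean()
    elif reduction == "sum":
        return loss.sum()
    return loss  # reduction == "none"

# Gradients flow through the constituent ops with no explicit derivative:
x = torch.randn(3, requires_grad=True)
t = torch.randn(3)
l1_loss_composite(x, t).backward()
```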
As a side note, `l1_loss_out` was incorrect in a number of ways. Moreover,
it is not exposed to the public, as `F.l1_loss` does not accept an `out=`
parameter; as such, it is not even tested. I wonder how useful it is to
have `out=` variants for loss functions if we don't expose them at all.
Going further, I wonder how useful it is to have `_out` variants for loss
functions at all, given that their most common use case is to return just
a real number.

cc jbschlosser
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79804
Approved by: https://github.com/zou3519, https://github.com/malfet