[functorch] Implemented nll loss through decomposition (pytorch/functorch#208)
* WIP on nll_loss batch rule implementation
* WIP on nll_loss (2)
* Updated tests
* Removed commented code
* Updated decomposition
* Fixed total_weight thanks to Richard's suggestion
* Removed nll_loss_nd and renamed nll_loss_forward_plumbing -> nll_loss_forward_decomposition
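
For readers unfamiliar with the idea, the following is a minimal pure-Python sketch of what "nll loss through decomposition" can look like: the loss is expressed as a gather of the target log-probabilities, a per-sample weighting, and a reduction, with total_weight computed as the sum of the weights of the non-ignored targets. This is not the code from this PR. The function name `nll_loss_forward_sketch`, its list-based inputs, and the choice to return total_weight for every reduction mode are assumptions made for illustration only.

```python
import math


def nll_loss_forward_sketch(log_probs, target, weight=None,
                            reduction="mean", ignore_index=-100):
    """Hypothetical decomposition of nll_loss into gather, weight, reduce.

    log_probs: list of N rows, each a list of C log-probabilities.
    target:    list of N class indices (or ignore_index).
    weight:    optional list of C per-class weights.
    Returns (loss, total_weight).
    """
    num_classes = len(log_probs[0])
    if weight is None:
        weight = [1.0] * num_classes

    # Per-sample weight; ignored targets contribute zero weight.
    w = [0.0 if t == ignore_index else weight[t] for t in target]

    # Gather step: pick the log-probability of the target class per sample.
    picked = [0.0 if t == ignore_index else row[t]
              for row, t in zip(log_probs, target)]

    # Weighted negative log-likelihood per sample.
    losses = [-wi * p for wi, p in zip(w, picked)]

    # Assumption: total_weight is the sum of the selected weights.
    total_weight = sum(w)

    if reduction == "none":
        return losses, total_weight
    if reduction == "sum":
        return sum(losses), total_weight
    if reduction == "mean":
        # Mean is a weighted mean; undefined when every target is ignored.
        if total_weight == 0:
            return float("nan"), total_weight
        return sum(losses) / total_weight, total_weight
    raise ValueError(f"unknown reduction: {reduction}")
```

Because every step is an ordinary elementwise, gather, or sum operation, a decomposition of this shape can be batched by reusing the rules that already exist for those simpler operations, rather than by writing a dedicated rule for the fused loss.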