Port of multi_margin_loss from TH to ATen (CPU) (#28062)
Summary:
This ports the existing TH CPU C implementation, MultiMarginCriterion, to the ATen function multi_margin_loss. ~~The ATen/C++ version is unfortunately significantly slower than the original. It is currently unclear to me what causes the performance degradation since the Tensor access is raw-pointer based similar to the original C implementation. (A first implementation I had created using TensorAccessor was even about 2x slower than the one in this PR).~~
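
For reference, the per-sample quantity that multi_margin_loss computes can be sketched in plain Python. This is a minimal sketch that assumes the documented definition of the loss without class weights; the name `multi_margin_loss_ref` is hypothetical, and this is not the ATen code from this PR.

```python
def multi_margin_loss_ref(x, target, p=1, margin=1.0):
    """Reference multi-class margin loss for a single score vector.

    x      : list of per-class scores
    target : index of the correct class
    p      : exponent applied to each hinge term (1 or 2)
    margin : required gap between the target score and the others

    Hypothetical helper for illustration; class weights are omitted.
    """
    if p not in (1, 2):
        raise ValueError("only p=1 and p=2 are supported")
    if not 0 <= target < len(x):
        raise IndexError("target index out of range")

    total = 0.0
    for i, score in enumerate(x):
        if i == target:
            continue  # the target class contributes no hinge term
        z = margin - x[target] + score
        if z > 0:
            total += z if p == 1 else z * z
    # The sum is divided by the number of classes, not by the number of
    # non-target classes.
    return total / len(x)


if __name__ == "__main__":
    print(multi_margin_loss_ref([0.1, 0.2, 0.4, 0.8], target=3))
```

A reference like this can serve as a check when comparing the output of the ported function against the previous TH implementation on small inputs.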
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28062
Differential Revision: D17980636
Pulled By: ezyang
fbshipit-source-id: bba27a13436adff5e687d95cc984ec2386ce7a73