pytorch
e8c6a650 - Adds grid_sampler to autocast fp32 list for 1.9 (#58679)

Commit
3 years ago
Adds grid_sampler to autocast fp32 list for 1.9 (#58679) Summary: Temporary fix for https://github.com/pytorch/pytorch/issues/42218. Numerically, grid_sampler should be fine in fp32 or fp16. So grid_sampler really belongs on the promote list. But performancewise, native grid_sampler backward kernels use gpuAtomicAdd, which is notoriously slow in fp16. So the simplest functionality fix is to put grid_sampler on the fp32 list. In https://github.com/pytorch/pytorch/pull/58618 I implement the right long-term fix (refactoring kernels to use fp16-friendly fastAtomicAdd and moving grid_sampler to the promote list). But that's more invasive, and for 1.9 ngimel says this simple temporary fix is preferred. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58679 Reviewed By: soulitzer Differential Revision: D28576559 Pulled By: ngimel fbshipit-source-id: d653003f37eaedcbb3eaac8d7fec26c343acbc07
Author
Parents
Loading