[cuDNN] Enable cudnn_batchnorm_spatial_persistent for BatchNorm3d channels_last_3d (#59129)
Summary:
This PR enables the cuDNN BatchNorm spatial persistent algorithm for BatchNorm3d (5-D tensors) in the channels_last_3d memory format, also known as NDHWC. Performance and numerical accuracy have been tested.
- [x] Performance check for common shapes.
- [x] Numerical accuracy check for 1 million random shapes.
https://github.com/xwang233/code-snippet/tree/master/batchnorm3d-channels-last/A100
https://github.com/xwang233/code-snippet/tree/master/batchnorm3d-channels-last/V100
- [ ] Convergence check for common 3D models
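
For reference, a minimal sketch of how the affected layout can be exercised from Python. It is not the benchmark or accuracy script linked above: the helper name `batchnorm3d_channels_last_check`, the default shape, and the seed are hypothetical, chosen only for illustration. It falls back to CPU when CUDA is unavailable, and whether the cuDNN spatial persistent kernel is actually selected depends on internal dispatch conditions that this snippet does not inspect.

```python
import torch


def batchnorm3d_channels_last_check(shape=(2, 8, 4, 6, 6), seed=0):
    """Compare BatchNorm3d output for NCDHW and NDHWC (channels_last_3d) inputs.

    Hypothetical helper for illustration only; shape is (N, C, D, H, W).
    """
    torch.manual_seed(seed)
    device = "cuda" if torch.cuda.is_available() else "cpu"

    x = torch.randn(shape, device=device)
    # Training mode normalizes with batch statistics, so the output does not
    # depend on the running stats that each forward call updates.
    bn = torch.nn.BatchNorm3d(shape[1]).to(device).train()

    # Reference: default contiguous (NCDHW) layout.
    ref = bn(x)

    # Same values, NDHWC strides.
    x_cl = x.contiguous(memory_format=torch.channels_last_3d)
    out = bn(x_cl)

    is_channels_last = x_cl.is_contiguous(memory_format=torch.channels_last_3d)
    max_abs_diff = (out - ref).abs().max().item()
    return is_channels_last, max_abs_diff


if __name__ == "__main__":
    is_cl, diff = batchnorm3d_channels_last_check()
    print(f"channels_last_3d input: {is_cl}, max abs diff vs NCDHW: {diff:.3e}")
```

The two layouts hold identical values, so the outputs should agree to within floating-point rounding; the acceptable tolerance is left to the caller.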
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59129
Reviewed By: mruberry
Differential Revision: D29593309
Pulled By: ngimel
fbshipit-source-id: 2caf282c6cf2f426aa14a24f94e6bddada68ddac