Fix problem in NHWC max_pool2d; use accumulate type in NHWC max_pool2d (#34934)
Summary:
This PR fixes https://github.com/pytorch/pytorch/issues/34736. Both code snippets in that issue now execute normally. More tests are also added.
This PR is a follow-up to https://github.com/pytorch/pytorch/issues/34519, where one variable was mistakenly left out when the max_pool2d kernel was updated.
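This PR also uses the accumulate type of scalar_t in the backward kernel, which resolves the numerical precision issue on fp16 when stride < kernel_size (pooling windows overlap in that case, so several output gradients are summed into the same input element).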
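
To illustrate why the accumulation type matters, here is a minimal sketch in plain Python. It is not the kernel code from this PR. The helper names (to_fp16, accumulate_in_fp16, accumulate_in_wide) and the gradient values are hypothetical, and fp16 rounding is emulated with the struct module's half-precision format. It assumes one input element receives nine gradient contributions, as can happen with a 3x3 window and stride 1.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def accumulate_in_fp16(grads):
    """Sum gradients while rounding the running total to fp16 after every add."""
    total = 0.0
    for g in grads:
        total = to_fp16(total + to_fp16(g))
    return total


def accumulate_in_wide(grads):
    """Sum gradients in a wider type and round to fp16 only once at the end."""
    total = 0.0
    for g in grads:
        total += to_fp16(g)
    return to_fp16(total)


# Hypothetical gradients: one large contribution followed by eight small ones.
# Near 2048 the spacing between adjacent fp16 values is 2, so adding 0.75 to
# an fp16 running total of 2048 rounds straight back to 2048.
grads = [2048.0] + [0.75] * 8

fp16_sum = accumulate_in_fp16(grads)  # small contributions are lost
wide_sum = accumulate_in_wide(grads)  # small contributions are kept

print(fp16_sum, wide_sum)  # 2048.0 2054.0
```

Rounding after every addition discards each small contribution, while accumulating in a wider type and casting once keeps them. This is the general effect the accumulate-type change addresses.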
cc csarofeen ptrblck jjsjann123 VitalyFedyunin ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34934
Differential Revision: D20512062
Pulled By: VitalyFedyunin
fbshipit-source-id: a461ebbb3e3684aa183ae40e38d8f55bb6f4fee1