Move the CUDA implementation of sqrt to ATen. (#27372)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27372
Fix #24638
Test Plan: Imported from OSS
Differential Revision: D18037944
Pulled By: VitalyFedyunin
fbshipit-source-id: d3dbbc167954c7bbee25be13b5b669433bca6ee5