reduce igamma instantiations (#70666)
Summary:
Don't compile scalar versions of the kernel (there is no scalar overload), and combine the igamma and igammac kernels into one.
Igamma cubin size drops from 10 MB to 2 MB on V100.
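
A minimal host-side sketch of the "combine kernels" idea, not the code from this change: it assumes the two ops can share one functor whose variant is chosen by a runtime flag, so a templated loop is instantiated once instead of once per op. The names `lower_regularized_gamma`, `IgammaKind`, `IgammaFunctor` and `apply_binary` are hypothetical, and the series used here is a simple stand-in for the real numerics.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: regularized lower incomplete gamma P(a, x) from its
// power series. Assumes a > 0 and moderate x >= 0; illustration only.
inline double lower_regularized_gamma(double a, double x) {
  if (x <= 0.0) return 0.0;
  double term = 1.0 / a;  // n = 0 term of sum x^n / (a (a+1) ... (a+n))
  double sum = term;
  for (int n = 1; n < 500; ++n) {
    term *= x / (a + n);
    sum += term;
    if (term < sum * 1e-16) break;
  }
  // Prefactor x^a * exp(-x) / Gamma(a), computed in log space.
  return sum * std::exp(a * std::log(x) - x - std::lgamma(a));
}

// Hypothetical tag: which of the two functions to evaluate.
enum class IgammaKind { Lower, Upper };

// One functor type serves both ops. The variant is runtime state rather than
// a template parameter, so apply_binary<IgammaFunctor> is instantiated once.
struct IgammaFunctor {
  IgammaKind kind;
  double operator()(double a, double x) const {
    const double p = lower_regularized_gamma(a, x);
    return kind == IgammaKind::Lower ? p : 1.0 - p;
  }
};

// Stand-in for a templated elementwise loop: every distinct Func type
// produces another copy of this code in the binary.
template <typename Func>
std::vector<double> apply_binary(const std::vector<double>& a,
                                 const std::vector<double>& x, Func f) {
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = f(a[i], x[i]);
  return out;
}
```

Calling `apply_binary(a, x, IgammaFunctor{IgammaKind::Upper})` and `apply_binary(a, x, IgammaFunctor{IgammaKind::Lower})` reuses the same instantiation; the cost is one branch per element.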
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70666
Reviewed By: malfet
Differential Revision: D33431359
Pulled By: ngimel
fbshipit-source-id: 440998f751251be274f40dd035efba08b8969192