[dte] fastpath implementations for mulgrad / divgrad (3/x) (#62437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62437
This diff adds a broadcast fastpath for the MulGradient and DivGradient ops and updates their tests to exercise it.
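For reference, the math such a gradient has to compute when one input is broadcast can be sketched as follows. This is an illustration only, not the code added in this diff: the function names `mul_grad_broadcast` and `div_grad_broadcast` are hypothetical, and the sketch assumes the simplest case, where A has shape (n, m) and B has shape (m,) and is broadcast across the rows of A.

```python
# Illustrative sketch only; these helper names are hypothetical and are not
# part of the diff. Assumes A is an n x m nested list and B is a length-m
# list that is broadcast across the rows of A.

def mul_grad_broadcast(dC, A, B):
    """Gradients of C = A * B with respect to A and B."""
    n, m = len(A), len(B)
    # dA has the shape of A: each upstream gradient is scaled by the matching B.
    dA = [[dC[i][j] * B[j] for j in range(m)] for i in range(n)]
    # dB has the shape of B: reduce over the broadcast (row) dimension.
    dB = [sum(dC[i][j] * A[i][j] for i in range(n)) for j in range(m)]
    return dA, dB


def div_grad_broadcast(dC, A, B):
    """Gradients of C = A / B with respect to A and B."""
    n, m = len(A), len(B)
    dA = [[dC[i][j] / B[j] for j in range(m)] for i in range(n)]
    # d(A / B) / dB = -A / B^2, again reduced over the broadcast dimension.
    dB = [sum(-dC[i][j] * A[i][j] / (B[j] * B[j]) for i in range(n))
          for j in range(m)]
    return dA, dB
```

In both cases the gradient for the broadcast input is a reduction over the broadcast dimension, which is presumably the step a specialized fastpath is meant to speed up.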
Test Plan: Added test cases to the elementwise ops tests that exercise the new MulGradient / DivGradient broadcast fastpath; CI runs them. Note that nothing outside these new test cases takes the new code paths yet: the caller must explicitly request allow_broadcast_fastpath=True, and no other code currently does so.
Differential Revision: D29938273
fbshipit-source-id: 281c1a109e38c25b9bf9ff8d832de60ac3c231a9