Fix the slowness of MVN's log_prob (#17294)
Summary:
This PR addresses the slowness of MVN's log_prob as reported in #17206.
t-vi: I find it complicated to handle the permutation of dimensions if we squeeze the singleton dimensions of bL, so I left that part as-is and kept the old approach. What do you think?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17294
Differential Revision: D14157292
Pulled By: ezyang
fbshipit-source-id: f32590b89bf18c9c99b39501dbee0eeb61e130d0