stride_properties fix (#77460)
Fixes part of #6015
The profiled permutation order is wrong, so nvfuser generates output in the wrong memory format. This does not seem to cause any test failures, other than CudaFusionGuard taking a graceful fallback path.
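
For context, a minimal sketch of how a permutation order can be read off a tensor's strides, and how it distinguishes memory formats. This is illustrative only and is not the nvfuser or profiler code; the helper names `stride_order`, `contiguous_strides`, and `channels_last_strides` are hypothetical, and the sketch assumes all strides are distinct (no size-1 dimensions).

```python
def stride_order(strides):
    """Return dimension indices sorted from innermost (smallest stride)
    to outermost (largest stride). Hypothetical helper for illustration."""
    return sorted(range(len(strides)), key=lambda d: strides[d])


def contiguous_strides(sizes):
    """Strides of a densely packed row-major tensor with the given sizes."""
    strides = []
    running = 1
    for size in reversed(sizes):
        strides.append(running)
        running *= size
    return tuple(reversed(strides))


def channels_last_strides(sizes):
    """Strides of a 4-D tensor with logical NCHW sizes stored as NHWC."""
    n, c, h, w = sizes
    return (h * w * c, 1, w * c, c)


sizes = (2, 3, 4, 5)
contig_order = stride_order(contiguous_strides(sizes))
channels_last_order = stride_order(channels_last_strides(sizes))
```

The two layouts yield different orders for the same sizes, so a fusion that is handed the wrong order would lay out its output in a memory format that differs from the one that was profiled.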
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77460
Approved by: https://github.com/davidberard98, https://github.com/ngimel