nvprim python runtime dtype correctness patch (#88452)
Cherry-picked from https://github.com/csarofeen/pytorch/pull/2133
- [x] cast the FusionDefinition output back to the original dtype recorded in the GraphModule
- [x] add a Python repro that uses Dynamo
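
The cast described in the first item can be illustrated with a minimal sketch. This is not the code from the patch: `restore_recorded_dtypes` and `StubTensor` are hypothetical names, dtypes are represented as plain strings, and the stub stands in for `torch.Tensor` so the example runs without PyTorch. The helper only assumes each output exposes `.dtype` and `.to(dtype)`, as `torch.Tensor` does.

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class StubTensor:
    """Hypothetical stand-in for torch.Tensor: only .dtype and .to() are modeled."""
    data: list
    dtype: str

    def to(self, dtype: str) -> "StubTensor":
        # A real tensor would convert its storage; the stub only relabels it.
        return StubTensor(self.data, dtype)


def restore_recorded_dtypes(outputs: Sequence[Any], recorded_dtypes: Sequence[Any]) -> tuple:
    """Cast each output back to the dtype recorded for it at trace time."""
    if len(outputs) != len(recorded_dtypes):
        raise ValueError("expected one recorded dtype per output")
    restored = []
    for out, dtype in zip(outputs, recorded_dtypes):
        # Skip the cast when no dtype was recorded or it already matches.
        if dtype is None or out.dtype == dtype:
            restored.append(out)
        else:
            restored.append(out.to(dtype))
    return tuple(restored)


# Assumption for illustration: the fused kernel returned float32 for a
# graph output that was recorded as float16.
fused_outputs = [StubTensor([1.0, 2.0], "float32"), StubTensor([3], "int64")]
recorded = ["float16", "int64"]
restored = restore_recorded_dtypes(fused_outputs, recorded)
```

Outputs whose dtype already matches are passed through untouched, so the cast only costs something when the runtime result and the recorded dtype actually disagree.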
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88452
Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry