pytorch
e1cfac42 - [jiterator] De-template launch_jitted_reduce_kernel (#80138)

Commit

2 years ago

[jiterator] De-template launch_jitted_reduce_kernel (#80138)

As with `jitted_gpu_kernel_impl`, this

1. Hoists static variables out and into a parent function
2. Moves template arguments into the `jit::KernelDescriptor` struct, as well as changing `vt0` to just be a runtime argument
3. Changes the types of pass-through arguments to `void*`

On my build I see a 0.5 MB decrease in binary size for `libtorch_cuda.so`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80138
Approved by: https://github.com/ngimel
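The three steps above follow a general de-templating pattern, which can be sketched roughly as below. This is not the code from the PR: every name ending in `Sketch` or `_sketch` is hypothetical, the descriptor fields are invented for illustration, and the real `jit::KernelDescriptor` and `launch_jitted_reduce_kernel` signatures may differ. The idea shown is that a thin templated parent owns the static state and builds a runtime descriptor, while the bulk of the logic lives in a single non-template function that takes `vt0` as a plain `int` and its pass-through data as `void*`.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for jit::KernelDescriptor: information that used to be
// carried by template parameters is stored as runtime data instead.
struct KernelDescriptorSketch {
  std::string name;
  std::string result_type;
};

// Hypothetical per-kernel cache. In the templated version this state would have
// been function-local statics inside the large template.
struct KernelCacheSketch {
  std::mutex mutex;
  bool compiled = false;
  int compile_count = 0;
};

// Illustrative return value so the behaviour can be observed.
struct LaunchResultSketch {
  int compile_count;
  int vt0;
  std::string result_type;
};

// Non-template worker: one copy in the binary, however many scalar types use it.
// vt0 is a runtime argument and the data pointer is type-erased to void*.
inline LaunchResultSketch launch_sketch(
    KernelCacheSketch& cache,
    const KernelDescriptorSketch& desc,
    int vt0,
    void* data) {
  assert(data != nullptr);
  std::lock_guard<std::mutex> lock(cache.mutex);
  if (!cache.compiled) {
    // A real implementation would JIT-compile the kernel here.
    cache.compiled = true;
    ++cache.compile_count;
  }
  // A real implementation would pass `data` through to the kernel launch.
  return LaunchResultSketch{cache.compile_count, vt0, desc.result_type};
}

// Maps a scalar type to a runtime string (illustrative only).
template <typename scalar_t>
const char* type_name_sketch();
template <>
inline const char* type_name_sketch<float>() { return "float"; }
template <>
inline const char* type_name_sketch<double>() { return "double"; }

// Thin templated parent: owns the hoisted static, fills in the descriptor,
// erases the pointer type, and forwards to the shared worker.
template <typename scalar_t>
LaunchResultSketch launch_typed_sketch(scalar_t* data, int vt0) {
  static KernelCacheSketch cache;  // one cache per instantiation
  KernelDescriptorSketch desc{"reduce_sketch", type_name_sketch<scalar_t>()};
  return launch_sketch(cache, desc, vt0, static_cast<void*>(data));
}
```

Under these assumptions, each instantiation of `launch_typed_sketch` is only a few lines of generated code, so adding more scalar types grows the binary far less than instantiating the whole launch routine per type. That is the kind of saving the commit message reports for `libtorch_cuda.so`.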

Author

peterbell10

Committer

pytorchmergebot

Parents

df665b1a
