SemanticDiff

pytorch
87f9b55a - Use explicit templates in `gpu_kernel_with_scalars` (#40992)

Commit

4 years ago

Use explicit templates in `gpu_kernel_with_scalars` (#40992)

Summary:
This trick should have no effect on performance, but it reduces size of kernels using the template by 10%.
For example, sizeof(BinaryMulDivKernel.cu.o) compiled by CUDA-10.1 toolchain for sm_75 before the change was 4.2Mb, after 3.8Mb.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40992

Differential Revision: D22398733

Pulled By: malfet

fbshipit-source-id: 6576f4da00dc5fc2575b2313577f52c6571d5e6f
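The commit message does not show the change itself, so the following is only a host-side C++ sketch of what "explicit templates" could mean here: binding the scalar operand through a named functor template instead of a capturing lambda. All names (`ScalarFirst`, `ScalarSecond`, `apply_unary`, `apply_with_scalar`, `Minus`) are hypothetical and are not taken from the PyTorch sources; the sketch makes no claim about the object-size effect reported above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical functor: fixes the scalar as the FIRST argument of a binary op.
template <typename T, typename Func>
struct ScalarFirst {
  Func f;
  T a;
  ScalarFirst(Func f_, T a_) : f(f_), a(a_) {}
  T operator()(T b) const { return f(a, b); }
};

// Hypothetical functor: fixes the scalar as the SECOND argument of a binary op.
template <typename T, typename Func>
struct ScalarSecond {
  Func f;
  T b;
  ScalarSecond(Func f_, T b_) : f(f_), b(b_) {}
  T operator()(T a) const { return f(a, b); }
};

// Stand-in for an elementwise unary kernel launch (plain CPU loop here).
template <typename T, typename Func>
std::vector<T> apply_unary(const std::vector<T>& in, Func f) {
  std::vector<T> out;
  out.reserve(in.size());
  for (const T& x : in) out.push_back(f(x));
  return out;
}

// Dispatches a binary op with one scalar operand to the unary path,
// using the named functor types above rather than capturing lambdas.
template <typename T, typename Func>
std::vector<T> apply_with_scalar(const std::vector<T>& in, T scalar,
                                 bool scalar_is_first, Func f) {
  if (scalar_is_first) {
    return apply_unary(in, ScalarFirst<T, Func>(f, scalar));
  }
  return apply_unary(in, ScalarSecond<T, Func>(f, scalar));
}

// Hypothetical non-commutative binary op, so operand order is observable.
struct Minus {
  int operator()(int a, int b) const { return a - b; }
};

// Usage sketch: the same op with the scalar on either side.
inline void demo() {
  const std::vector<int> v{1, 2, 3};
  const std::vector<int> lhs = apply_with_scalar(v, 10, true, Minus{});
  const std::vector<int> rhs = apply_with_scalar(v, 10, false, Minus{});
  assert(lhs[0] == 9);   // 10 - 1
  assert(rhs[0] == -9);  // 1 - 10
}
```

In this sketch each wrapper has an explicit, nameable type parameterized on the element type and the operation, whereas every lambda expression has its own unnamed closure type. Whether that is the mechanism behind the size reduction in the commit is an assumption, not something the message states.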

Author

malfet

Committer

facebook-github-bot
