SemanticDiff

pytorch
92562046 - Optimize __dlpack_device__ performance (#86665)

Commit

1 year ago

Optimize __dlpack_device__ performance (#86665)

This can be critical when processing a large number of tensors

```bash
python -m timeit --setup 'import torch; t = torch.empty(1000, device="cuda")' 't.__dlpack_device__()'
```

based on 1.12.1:
before: 100000 loops, best of 5: 2.32 usec per loop
after: 500000 loops, best of 5: 844 nsec per loop

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86665
Approved by: https://github.com/SunDoge, https://github.com/soulitzer
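The timings quoted in the commit message work out to a speedup of roughly 2.7x. The sketch below reproduces that arithmetic and shows one way to run the same measurement through Python's `timeit` module instead of the command line. The helper names `best_per_call` and `speedup` are hypothetical and not part of PyTorch. The live measurement falls back to a CPU tensor when CUDA is unavailable, so its numbers will not match the ones reported above.

```python
import timeit


def best_per_call(fn, number=100_000, repeat=5):
    """Best per-call time in seconds over `repeat` runs of `number` calls.

    Mirrors the "best of 5" figure that `python -m timeit` prints.
    (Hypothetical helper, for illustration only.)
    """
    totals = timeit.repeat(fn, number=number, repeat=repeat)
    return min(totals) / number


def speedup(before_s, after_s):
    """Ratio of the old per-call time to the new one."""
    return before_s / after_s


# Timings quoted in the commit message: 2.32 usec before, 844 nsec after.
reported = speedup(2.32e-6, 844e-9)

if __name__ == "__main__":
    print(f"reported speedup: {reported:.2f}x")
    try:
        import torch
    except ImportError:
        print("torch is not installed; skipping the live measurement")
    else:
        # Assumption: fall back to CPU so the sketch runs without a GPU.
        device = "cuda" if torch.cuda.is_available() else "cpu"
        t = torch.empty(1000, device=device)
        per_call = best_per_call(t.__dlpack_device__)
        print(f"{device}: {per_call * 1e9:.0f} nsec per call")
```

Passing the bound method `t.__dlpack_device__` directly to `timeit.repeat` avoids compiling a statement string, so the measured time also includes one Python-level call of overhead; treat the result as a relative figure rather than an exact match for the command-line output.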

Author

huww98

Committer

pytorchmergebot
