pytorch
e2125436 - Improve float pickling speed. (#28553)

Commit
5 years ago
Improve float pickling speed. (#28553)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28553

This change improves double pickling in 1M double list microbenchmark by roughly 40% (33msec -> 20msec).

The main benefit is avoiding per-byte bounds checks, so we only bounds-check 2 times rather than 9 times.

Unpickle is already doing something reasonable, so no need to change.

fwiw, putting the swapping logic in a separate func/lambda provided roughly 20% better results, consistently when microbenchmarking. Looking at the objdump disassembly, gcc somehow generates better code when it's separated.

ghstack-source-id: 92585739

Test Plan:
Benchmarks:
   buck build mode/opt experimental/jeremyl/c2:SerializationBench
   buck-out/opt/gen/experimental/jeremyl/c2/SerializationBench --bm_regex=.*Float.*

Correctness:
   buck build mode/dev-nosan caffe2/test/...

Differential Revision: D18089481

fbshipit-source-id: a5f39e5d38c432893844241a7cce244831037e1f
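The "2 times rather than 9 times" point can be illustrated with a small sketch. This is not the PyTorch Pickler code: `SketchWriter`, `encodeBigEndian`, and the 256-byte buffer size are hypothetical names and values chosen for illustration. The only format fact relied on is that the pickle BINFLOAT opcode is the byte 'G' followed by an 8-byte big-endian IEEE 754 double. Instead of a byte swap, the sketch builds the big-endian bytes with shifts so it does not depend on host endianness; the encoding still lives in its own function, echoing the commit's note about keeping that logic separate.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical helper: encode a double as 8 big-endian bytes.
// Uses shifts rather than a byte swap, so host endianness does not matter.
inline std::array<unsigned char, 8> encodeBigEndian(double value) {
  std::uint64_t bits = 0;
  std::memcpy(&bits, &value, sizeof(bits));  // well-defined type pun
  std::array<unsigned char, 8> out{};
  for (int i = 0; i < 8; ++i) {
    out[i] = static_cast<unsigned char>((bits >> (56 - 8 * i)) & 0xFF);
  }
  return out;
}

// Hypothetical buffered writer, standing in for a pickler's output buffer.
class SketchWriter {
 public:
  // One bounds check for the opcode byte.
  void pushOp(char op) {
    ensure(1);
    buf_[pos_++] = op;
  }

  // Two bounds checks in total: one in pushOp, one for all 8 payload bytes.
  // A per-byte approach would check 9 times (opcode plus 8 payload bytes).
  void pushDouble(double value) {
    const std::array<unsigned char, 8> bytes = encodeBigEndian(value);
    pushOp('G');  // pickle BINFLOAT opcode
    ensure(bytes.size());
    std::memcpy(buf_.data() + pos_, bytes.data(), bytes.size());
    pos_ += bytes.size();
  }

  // Flush pending bytes and hand back everything written so far.
  std::string take() {
    flush();
    std::string result = std::move(out_);
    out_.clear();
    return result;
  }

 private:
  // Make room for n more bytes by flushing if the buffer would overflow.
  void ensure(std::size_t n) {
    if (pos_ + n > buf_.size()) {
      flush();
    }
  }

  void flush() {
    out_.append(buf_.data(), pos_);
    pos_ = 0;
  }

  std::array<char, 256> buf_{};
  std::size_t pos_ = 0;
  std::string out_;
};
```

Calling `pushDouble(1.0)` on a fresh `SketchWriter` and then `take()` yields 9 bytes: 'G' followed by the big-endian encoding of 1.0.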