pytorch
3d745508 - String optimizations related to serialization. (#28230)

Commit
6 years ago
String optimizations related to serialization. (#28230)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28230

This change improves the pickling small data benchmark by roughly 30% (25.8usec -> 18.05usec).

One of the main issues was that we were spending 25%+ of the cpu profile time in std::[o]stringstream constructors alone.

Two main parts:
- Change some std::stringstream to std::ostringstream, when they showed up on hot-ish paths, and it was trivial to convert them. Roughly 27% of the std::stringstream constructor time is spent building the constituent std::basic_istream. If the istream isn't needed, don't construct it.
- For a couple of very hot paths (e.g. Pickler::pushGlobal), just convert to traditional string::append(). std::ostringstream is convenient, but not particularly efficient.

ghstack-source-id: 92153103

Test Plan:
Benchmarking: buck build mode/opt experimental/jeremyl/c2:SerializationBench
Correctness: buck test mode/dev-nosan caffe2/test/...

Differential Revision: D17982181

fbshipit-source-id: 7fd4d267293231244c10c1e5b8f4951a7a3d852f
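The two parts described in the summary can be illustrated with a minimal sketch. The helper names below (joinWithStringstream, joinWithOstringstream, joinWithAppend) and the "module\nname\n" output layout are hypothetical and chosen only for illustration; this is not the code changed in the commit, and it makes no claim about what Pickler::pushGlobal actually writes. It only shows three ways of building the same string, from most to least stream machinery.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helpers for illustration only; not taken from PyTorch.
// Each one builds the string "<module>\n<name>\n".

// Variant 1: std::stringstream constructs both the input and the output
// stream parts, even though only the output side is used here.
std::string joinWithStringstream(const std::string& module,
                                 const std::string& name) {
  std::stringstream ss;
  ss << module << '\n' << name << '\n';
  return ss.str();
}

// Variant 2: std::ostringstream skips the unused input stream part.
std::string joinWithOstringstream(const std::string& module,
                                  const std::string& name) {
  std::ostringstream ss;
  ss << module << '\n' << name << '\n';
  return ss.str();
}

// Variant 3: plain std::string appends, with no stream object at all.
// reserve() sizes the buffer once, up front.
std::string joinWithAppend(const std::string& module,
                           const std::string& name) {
  std::string out;
  out.reserve(module.size() + name.size() + 2);
  out.append(module);
  out.push_back('\n');
  out.append(name);
  out.push_back('\n');
  return out;
}

// All three variants are expected to produce identical bytes.
inline void checkVariantsAgree() {
  const std::string expected = "torch\nTensor\n";
  assert(joinWithStringstream("torch", "Tensor") == expected);
  assert(joinWithOstringstream("torch", "Tensor") == expected);
  assert(joinWithAppend("torch", "Tensor") == expected);
}
```

The sketch demonstrates equivalence of output only. Any speed difference depends on the standard library and workload, so it would need to be measured, as the commit does with its own benchmark.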
Author
Jeremy Lilley