Load CPU MLModel first, and configured MLModel async (#80941)
Summary:
MLModel loads much faster when compute units are set to CPU only. It seems that when loading with compute units set to all, a large amount of preprocessing work is done during init.
So, to speed up our effect load time, load a CPU-only MLModel synchronously and a fully configured MLModel asynchronously. When the second model finishes loading, about 600 ms later, swap the models.
As a result, inference runs on the CPU for about half a second, after which it kicks over to the GPU or Neural Engine.
On an iPhone 12 I'm seeing a >10x improvement in load time as recorded by RenderTimeLogger.cpp.
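The load-fast-then-swap pattern described above can be sketched generically. This is a minimal, hedged illustration in Python (not the actual Objective-C++/Core ML implementation in this PR); the class and parameter names are hypothetical. The fast loader stands in for the CPU-only MLModel and the slow loader for the fully configured one:

```python
import threading


class SwappableModel:
    """Sketch: serve a fast-to-load model immediately, then atomically
    swap in the fully configured model once its slower load finishes.
    All names here are illustrative, not part of any real API."""

    def __init__(self, load_fast, load_full):
        self._lock = threading.Lock()
        # Synchronous load of the quick model (e.g. CPU-only compute units).
        self._model = load_fast()
        # Kick off the slow load (e.g. all compute units) in the background.
        self._thread = threading.Thread(
            target=self._load_full, args=(load_full,), daemon=True
        )
        self._thread.start()

    def _load_full(self, load_full):
        full = load_full()  # expensive init happens off the critical path
        with self._lock:
            self._model = full  # swap once the configured model is ready

    def predict(self, x):
        # Inference uses whichever model is currently installed.
        with self._lock:
            return self._model(x)
```

In the real change the swap happens once the configured MLModel's load completes, so only the first ~600 ms of inference calls hit the CPU model.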
Test Plan:
- Add an override to https://www.internalfb.com/intern/qe2/ig_ios_person_segmentation_universe to opt into the coreml segmentation model
- Launch IG camera and apply an effect that uses segmentation, such as green screen
- Confirm that segmentation works.
https://pxl.cl/277JL
Reviewed By: kimishpatel
Differential Revision: D37597965
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80941
Approved by: https://github.com/mcr229, https://github.com/kimishpatel