Squashed commit of the following:
commit 4cf03b29b7d39374ec2c424fd85c1815b4b27eb6
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Tue Apr 28 10:16:53 2020 -0700
replace hypot with sqrt
commit 4cc4837dae4d14efd133b51fcc796a1f2d14771f
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 13:00:03 2020 -0700
revert white space change
commit 0683bf1b743babd8507d985b7a1096962301a184
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 12:57:14 2020 -0700
fix copy
commit b9e08bd42580d21acc2263e0212eb599612d7521
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 12:39:27 2020 -0700
remove include of complex
commit 13d3df4816bec7a23469e088df79b717c209373a
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 12:38:32 2020 -0700
fix scalar constructor
commit 2f5293d4bee8f1af7513adda922e8fe6961a2774
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 12:27:18 2020 -0700
resolve review nits
commit 8bf035de3430a8af7837f43c392bd2e48136d2a6
Merge: a15f4d546c 201ba13911
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 12:23:15 2020 -0700
Merge branch 'master' of github.com:pytorch/pytorch into hacking-dispatch
commit a15f4d546c5c3e5633fbad9c1d6d83a321ba86d0
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 12:16:44 2020 -0700
revert all wrap changes
commit aac470d29e00f3bceedd2ec36c224ab4e75b34ab
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 12:06:54 2020 -0700
fix
commit 3c5dd3130ef249222eb78f526c6ac59fb237de9d
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 11:35:38 2020 -0700
revert white space change
commit 285d7c7d63f19d4a476511555505e734c0d47223
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 11:19:57 2020 -0700
fix warning
commit 38fe795e80097bd7ce5638e2203978db4a8fefdb
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 11:17:27 2020 -0700
USE_C10_COMPLEX
commit 201ba139115adeecae4f094a9c9790200e53ff99
Author: Parth Agarwal <iparthagarwal@gmail.com>
Date: Mon Apr 27 11:11:35 2020 -0700
Correct $ANDROID_HOME string empty check (#37064)
Summary:
Updated the shell code to correctly test whether the $ANDROID_HOME env variable is empty.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37064
Differential Revision: D21181787
Pulled By: IvanKobzarev
fbshipit-source-id: 40c1d79d0fb730c7f68aa7472ce9b2398e91f2a2
commit 805c417ec94ad13b3d974a6f23d85bf69e9ffdb5
Author: Xiao Wang <24860335+xwang233@users.noreply.github.com>
Date: Mon Apr 27 10:59:32 2020 -0700
Implement avg_pool2d kernel for channels_last (#35855)
Summary:
Implement avg_pool2d for channels_last. This will close https://github.com/pytorch/pytorch/issues/34996.
Performance compared with **avg_pool2d** contiguous can be found at https://github.com/xwang233/code-snippet/blob/ed6617c6bc48dac5757d9a1ca6f5db5a68e5d01b/avg-pool2d-channels-last/avg-pool2d-naive.ipynb
cc csarofeen ptrblck
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35855
Differential Revision: D21187360
Pulled By: VitalyFedyunin
fbshipit-source-id: b654b56168bc3982be306b634c7ed2f92018a9e5
commit ec8006cc1635a088aae36aa9263bc85140d9aa6e
Author: mattip <matti.picus@gmail.com>
Date: Mon Apr 27 10:58:01 2020 -0700
[ONNX] fix provider_version and add consistency test (#36797)
Summary:
forward port the test from pr gh-36795, xref issue gh-32561
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36797
Differential Revision: D21257034
Pulled By: ezyang
fbshipit-source-id: d217da0e74f00a433c904defc0bf3eb5f594fd5e
commit 0048243f70f37a3ae74725fb21c88704d3ab62bb
Author: Lukas Koestler <lkskstlr@gmail.com>
Date: Mon Apr 27 10:46:07 2020 -0700
Check compiler -v to determine compiler (fix #33701) (#37293)
Summary:
As described in the issue (https://github.com/pytorch/pytorch/issues/33701), the compiler check
for building cpp extensions does not work with ccache.
In that case we check `compiler -v` to determine which
compiler is actually being used and check that instead.
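The detection step could look roughly like this Python sketch (the helper names are hypothetical illustrations, not the actual `torch.utils.cpp_extension` code):

```python
import subprocess

def compiler_from_version_output(text):
    """Classify a compiler from the text printed by `<compiler> -v`."""
    if "clang version" in text or "Apple clang" in text:
        return "clang"
    if "gcc version" in text:
        return "gcc"
    return "unknown"

def detect_real_compiler(compiler="c++"):
    # ccache is a thin wrapper: invoking it with -v forwards to the real
    # compiler, so the version banner reveals what will actually be used.
    proc = subprocess.run([compiler, "-v"], capture_output=True, text=True)
    return compiler_from_version_output(proc.stderr + proc.stdout)
```

Parsing the banner rather than the executable name is what makes the check robust to ccache symlinks.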
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37293
Differential Revision: D21256913
Pulled By: ezyang
fbshipit-source-id: 5483a10cc2dbcff98a7f069ea9dbc0c12b6502dc
commit 6d409481b38692926930278002b50d2075396557
Author: Gao, Xiang <qasdfgtyuiop@gmail.com>
Date: Mon Apr 27 10:29:07 2020 -0700
Add overloads of std:: math functions for c10::complex (#35725)
Summary:
Issue: https://github.com/pytorch/pytorch/issues/35284
~This depends on and contains https://github.com/pytorch/pytorch/pull/35524. Please review after the dependency gets merged and I will rebase to get a clean diff.~
The implementation of most functions follow the pattern
```C++
template<typename T>
C10_HOST_DEVICE c10::complex<T> some_function(c10::complex<T> x) {
#if defined(__CUDACC__) || defined(__HIPCC__)
  return static_cast<c10::complex<T>>(thrust::some_function(static_cast<thrust::complex<T>>(x)));
#else
  return static_cast<c10::complex<T>>(std::some_function(static_cast<std::complex<T>>(x)));
#endif
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35725
Differential Revision: D21256854
Pulled By: ezyang
fbshipit-source-id: 2112ba6b79923450feafd7ebdc7184a3eaecadb6
commit a08a9f3b8222bc438ebdac86ecc44c1793d83c6b
Author: Ryad ZENINE <r.zenine@gmail.com>
Date: Mon Apr 27 10:19:23 2020 -0700
Enable uint8 upsampling 2 (#35029)
Summary:
Hi everyone,
This is a super small PR to enable `uint8` support for `nearest` up-sampling on `cpu` and `cuda`.
This work enables us to move forward with support for `uint8` images in `torchvision`.
See impacted issues :
https://github.com/pytorch/vision/issues/1375
https://github.com/pytorch/vision/issues/1179#issuecomment-558197607
Note: I wanted to add a unit test to ensure we have the expected behavior. I could not locate the `upsampling` unit tests for `nearest`. I can add the test if you point me to the right location.
Thanks
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35029
Reviewed By: cpuhrsch
Differential Revision: D21227144
Pulled By: fmassa
fbshipit-source-id: 33c4b5188dedd8f7f872e9d797e2a9b58ee7315c
commit 5c9d1e48242587a9b1958df2d2efea3472072f4f
Author: Xingying Cheng <xcheng16@fb.com>
Date: Mon Apr 27 10:16:59 2020 -0700
Propagate module lints for mobile scripted module. (#37046)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37046
ghstack-source-id: 102669259
Creates a Python API entry to generate mobile model lints, which takes a scripted module as an argument and returns a map of module lints.
The initial version creates a placeholder that includes module bundled input as the first lint instance. More lints will be added in the future.
Test Plan: python test/test_optimizer.py
Reviewed By: dreiss
Differential Revision: D21164648
fbshipit-source-id: 9e8f4e19d74b5464a55cc73b9dc18f358c5947d6
commit 5b9f7f7b0e205a6d8d5f2e61f558eee378f0ce40
Author: Mo Zhou <cdluminate@gmail.com>
Date: Mon Apr 27 09:34:52 2020 -0700
[cmake] Add USE_SYSTEM_{GLOO,FP16,PTHREADPOOL,PSIMD,FXDIV,BENCHMARK} options (#14699) (#37277)
Summary:
These options are disabled by default and are intended for use by
Linux distro developers. When the existing shortcut option
USE_SYSTEM_LIBS is toggled, these new options are enabled as well.
Additionally, when USE_SYSTEM_LIBS is toggled, setup.py no
longer checks for the existence of git submodules.
ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37277
Differential Revision: D21256999
Pulled By: ezyang
fbshipit-source-id: 84f97d008db5a5e41a289cb7bce94906de3c52cf
commit 3a0ff3cd2f04fcf3d4f6d152ab0772f048375cb6
Author: peterjc123 <peterghost86@gmail.com>
Date: Mon Apr 27 08:28:56 2020 -0700
Generate environment restore script for Windows build jobs (#37319)
Summary:
for better debugging purposes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37319
Differential Revision: D21257011
Pulled By: ezyang
fbshipit-source-id: 41c7f1aa440f3ea626536b64392cca32f7c32dd3
commit 007163407cd68a5131c159c2944bfb772ec913d4
Author: Mo Zhou <cdluminate@gmail.com>
Date: Mon Apr 27 08:14:39 2020 -0700
[cmake] Support "Generic" BLAS (#14699) (#37276)
Summary:
The "Generic" BLAS refers to the Netlib BLAS. This option is meaningful
to the Debian family due to the "update-alternatives" mechanism, which
enables the user to switch the libblas.so providers between different
implementations at runtime, such as ATLAS, OpenBLAS, and Intel MKL.
As such, building against the generic BLAS provides much flexibility.
This new option is not documented in setup.py because it's only supposed
to be used by Linux distro (especially Debian family) developers.
ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37276
Differential Revision: D21256877
Pulled By: ezyang
fbshipit-source-id: 55a5356653a1cfc763a5699b04afe5938f2007ec
commit 22ac071d9a173ea2358dd7c88a3a47fb1e2a2fe1
Author: Pavel Izmailov <izmailovpavel@gmail.com>
Date: Mon Apr 27 07:39:50 2020 -0700
Add SWA to PyTorch mainline (#35032)
Summary:
This PR is based on the issue https://github.com/pytorch/pytorch/issues/29994#issue-524418771 and the discussion in the previous version of the PR https://github.com/pytorch/pytorch/pull/30559. Specifically, I followed the interface outlined in this [comment](https://github.com/pytorch/pytorch/pull/30559#issuecomment-574864768).
## Structure
- `torch/optim/swa_utils.py` contains the implementation of `AveragedModel` class, `SWALR` learning rate scheduler and `update_bn` utility
- `test/test_optim.py` contains unit tests for the three components of SWA
- `torch/optim/swa_utils.pyi` describes the interface of `torch/optim/swa_utils.py`
The new implementation consists of
- `AveragedModel` class; this class creates a copy of a given model and allows computing running averages of the parameters.
- `SWALR` learning rate scheduler; after a certain number of epochs switches to a constant learning rate; this scheduler is supposed to be chained with other schedulers.
- `update_bn` utility; updates the Batch Normalization activation statistics for a given model and dataloader; this utility is meant to be applied to `AveragedModel` instances.
For `update_bn` I simplified the implementation compared to the [original PR](https://github.com/pytorch/pytorch/pull/30559) according to the suggestions by vadimkantorov.
## Example
```python
loader, optimizer, model, loss_fn = ...
swa_model = torch.optim.swa_utils.AveragedModel(model)
# You can use custom averaging functions with the `avg_function` parameter
ema_avg = lambda p_avg, p, n_avg: 0.1 * p_avg + 0.9 * p
ema_model = torch.optim.swa_utils.AveragedModel(model, avg_function=ema_avg)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300)
swa_start = 160
swa_scheduler = SWALR(optimizer, start_epoch=swa_start, swa_lr=0.05)

for i in range(300):
    for input, target in loader:
        optimizer.zero_grad()
        loss_fn(model(input), target).backward()
        optimizer.step()
    scheduler.step()
    swa_scheduler.step()
    if i > swa_start:
        swa_model.update_parameters(model)

# Update bn statistics for the swa_model at the end
torch.optim.swa_utils.update_bn(loader, swa_model)
```
UPDATED:
```python3
loader, optimizer, model, loss_fn = ...
swa_model = torch.optim.swa_utils.AveragedModel(model)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300)
swa_start = 160
swa_scheduler = SWALR(optimizer, swa_lr=0.05)

for i in range(300):
    for input, target in loader:
        optimizer.zero_grad()
        loss_fn(model(input), target).backward()
        optimizer.step()
    if i > swa_start:
        swa_model.update_parameters(model)
        swa_scheduler.step()
    else:
        scheduler.step()

# Update bn statistics for the swa_model at the end
torch.optim.swa_utils.update_bn(loader, swa_model)
```
Fixes https://github.com/pytorch/pytorch/issues/29994
cc soumith vincentqb andrewgordonwilson vadimkantorov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35032
Differential Revision: D21079606
Pulled By: vincentqb
fbshipit-source-id: e07f5e821f72ada63789814c2dcbdc31f0160c37
commit 828d590b06109f1ed1ab5d5e7fc6601aae4af198
Author: Jeff Daily <jeff.daily@amd.com>
Date: Mon Apr 27 06:48:05 2020 -0700
[ROCm] Update to ROCm 3.3 (#37247)
Summary:
CC ezyang .
ROCm 3.3 packages went live on 2020-04-01. Tag 376 was pushed on 2020-04-15, so it should be based on ROCm 3.3.
The upgrade to ROCm 3.3 is required as part of the effort to stabilize ROCm CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37247
Differential Revision: D21256198
Pulled By: ezyang
fbshipit-source-id: 92ac21c0122eda360ec279d2c3d462c3e6bf4646
commit f41742ff2fd5c9507c037dc120d75f6f191a87b1
Author: Wanchao Liang <wanchaol@users.noreply.github.com>
Date: Sun Apr 26 22:18:55 2020 -0700
[autograd] remove spinning for dist engine (#36606)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36606
This PR refactors the continuation logic of the async mode in the autograd
engine to avoid launching spinning work. To achieve that:
1. remove the continuation logic in
execute_graph_task_with_continuiation
2. separate the usage of execute_graph_task between dist_engine and
local engine, now dist_engine universally use
`execute_graph_task_until_ready_queue_empty` (a better name appreciated
here).
3. remove enqueue_blocked_task_on_cpu
4. remove the async mode in `execute_with_graph_task` as we don't need
to use it in dist_engine
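The drain-until-empty idea behind `execute_graph_task_until_ready_queue_empty` can be sketched in Python (a hedged illustration of the concept only, not the C++ implementation):

```python
from queue import Empty, SimpleQueue

def execute_graph_task_until_ready_queue_empty(ready_queue, run_task):
    """Run tasks until the ready queue is drained, then return instead of
    spinning or re-enqueuing continuation work."""
    done = 0
    while True:
        try:
            task = ready_queue.get_nowait()
        except Empty:
            return done  # nothing left: the calling thread is released
        run_task(task)
        done += 1

# Tiny demonstration with string stand-ins for graph nodes.
q = SimpleQueue()
for node in ("node_a", "node_b", "node_c"):
    q.put(node)
ran = []
executed = execute_graph_task_until_ready_queue_empty(q, ran.append)
```

The key property is that the loop terminates when the queue is empty rather than blocking, which is what lets the distributed engine avoid tying up threads.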
Test Plan: Imported from OSS
Differential Revision: D21032731
Pulled By: wanchaol
fbshipit-source-id: 708ea3bc14815bdc151b56afa15eb85b4ac0f4b1
commit ed9ec3c96fdc9656c5bac144887c312a0168469e
Author: Wanchao Liang <wanchaol@users.noreply.github.com>
Date: Sun Apr 26 22:18:55 2020 -0700
[autograd] refactor some functions (#37061)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37061
This PR refactors:
1. move `set_device` out of Engine
2. put `graph_task_completed` into GraphTask
3. put `mark_graph_task_completed` into GraphTask
This also makes it easy for the distributed engine to call those functions.
Test Plan: Imported from OSS
Differential Revision: D21188688
Pulled By: wanchaol
fbshipit-source-id: f56106e6ed7d966cfa4d962781c7865cc3c5321d
commit 47fec01c45c696e247aff9e910f29a9586ae0869
Author: lixinyu <lixinyu@devgpu175.prn2.facebook.com>
Date: Sun Apr 26 10:57:53 2020 -0700
Fix cpp extension compile failure on some envs (#37221)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37221
Test Plan: Imported from OSS
Differential Revision: D21226873
Pulled By: glaringlee
fbshipit-source-id: 0a390bbeaf153ee5ec355943f92c2dbcc5e04b59
commit b428f454e13f6e8055124ea19c32b554017137d0
Author: Mike Ruberry <mruberry@fb.com>
Date: Sun Apr 26 04:25:28 2020 -0700
Revert D18927220: if_constexpr for C++14
Test Plan: revert-hammer
Differential Revision:
D18927220
Original commit changeset: 19a135e00af6
fbshipit-source-id: a1b8755a27903b98b742881b3ecce4f5e99543b2
commit b64fc3c4b5d927928770f9b343eb845123367084
Author: Mike Ruberry <38511765+mruberry@users.noreply.github.com>
Date: Sat Apr 25 21:16:50 2020 -0700
Changes warnings generated in cpp to show point of Python origination (#36052)
Summary:
Today in PyTorch, warnings triggered in C++ are printed to Python users like this:
`../aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.`
This may be unhelpful to Python users, who have complained it's difficult to relate these messages back to their programs. After this PR, warnings that go through the PyWarningHandler (and allow it to add context) are printed like this:
```
test/test_torch.py:16463: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead. (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:81.)
cpu_result = getattr(cpu_tensor, op_str)(*cpu_args)
```
This relates the warning back to the user's program. The information about the cpp file and line number is preserved in the body of the warning message.
Some warnings, like those generated in the JIT, already account for a user's Python context, and so they specify that they should be printed verbatim and are unaffected by this change. Warnings originating in Python and warnings that go through c10's warning handler, which prints to cerr, are also unaffected.
A test is added to test_torch.py for this behavior. The test relies on uint8 indexing being deprecated and its warning originating from its current header file, which is an unfortunate dependency. We could implement a `torch.warn` function, instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36052
Differential Revision: D20887740
Pulled By: mruberry
fbshipit-source-id: d3515c6658a387acb7fccaf83f23dbb452f02847
commit f8ec51bd865bb488dc0c30f1e970c5dc49ce4727
Author: Peter Bell <peterbell10@live.co.uk>
Date: Sat Apr 25 20:55:28 2020 -0700
Ensure DataParallel replicas can be saved (#37307)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/37182
The `zero_grad` wrapper from `_replicate_for_data_parallel` can't be pickled. So instead, I set an attribute `_is_replica = True` and check for this in `Module.zero_grad`.
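A minimal torch-free sketch of the flag-based fix (class and attribute names other than `_is_replica` are illustrative):

```python
import pickle

class Module:
    """Minimal stand-in for torch.nn.Module, just to illustrate the approach."""
    def __init__(self):
        self._is_replica = False
        self.zeroed = False

    def zero_grad(self):
        # Branch on a plain attribute instead of installing a per-instance
        # wrapper function: a closure stored on the instance can't be pickled,
        # but a boolean attribute survives pickling without trouble.
        if self._is_replica:
            return
        self.zeroed = True

replica = Module()
replica._is_replica = True
restored = pickle.loads(pickle.dumps(replica))  # round-trips cleanly
restored.zero_grad()                            # no-op on the replica
```

The behavior change lives in the method body, so replicas stay picklable while still skipping `zero_grad`.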
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37307
Differential Revision: D21246119
Pulled By: mrshenli
fbshipit-source-id: 4755786d48a20bc247570ba672de9dd526914ce1
commit 2b050371b4cecd9c12b5f763e6867ff1c1019aab
Author: Omkar Salpekar <osalpekar@fb.com>
Date: Sat Apr 25 20:11:33 2020 -0700
Make listenLoopInternal non-virtual (#37265)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37265
In PGA, `listenLoopInternal` should not be virtual - PGA doesn't have any child classes that override this. Re-arranged some comments for `listenLoop` as well.
ghstack-source-id: 102880792
Test Plan: Sandcastle/CI
Differential Revision: D21238761
fbshipit-source-id: 5ec5058bc462182cf970faca9a734c11c7be2a32
commit d98ea604f4c31f86b2afe1afd96f283ef77c4da2
Author: Omkar Salpekar <osalpekar@fb.com>
Date: Sat Apr 25 19:22:51 2020 -0700
Improve Error Message for Dist Autograd Context Cleanup Failure (#37255)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37255
Improved the error message logged when Distributed Autograd Context cleanup fails by adding node information and the underlying error. The previous error message assumed the failure was caused by too many RPCs failing, but this is not necessarily the case.
ghstack-source-id: 102867620
Test Plan: Ensuring Sandcastle/CI tests pass. Verified the correct message is logged when this code path is executed in `test_backward_node_failure` and `test_backward_node_failure_python_udf` .
Differential Revision: D20950664
fbshipit-source-id: 267318187b7ef386930753c9679a5dfab6d87018
commit b198796a2810ebd7fdefec3389c17be47ba6a6ce
Author: Zafar <cc.rafaz@zafar.cc>
Date: Sat Apr 25 18:19:03 2020 -0700
[quant] quantized reflection_pad1d (#36450)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36450
Test Plan: Imported from OSS
Differential Revision: D20984967
Pulled By: z-a-f
fbshipit-source-id: 4731f16ba05a6aa57636d9ab85f12dfdeebcf08d
commit 7604f470ed083d55c6a25bee3f995c7e71ea488f
Author: Yinghai Lu <yinghai@fb.com>
Date: Sat Apr 25 18:05:21 2020 -0700
Add weight info in debug_ssa_net (#37262)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37262
It's convenient to have weight info in debug_ssa_net so that we can tell what is a weight and what are primary inputs. We can then easily get their shape and size info with a post-processing script.
Reviewed By: ChunliF
Differential Revision: D21237537
fbshipit-source-id: 1fadc605283ef2eed78c44494e062a16ccf135ab
commit 92e91cee8dc9d78314308ace125022835fcbc0c9
Author: Ksenija Stanojevic <ksenija.stanojevic@gmail.com>
Date: Sat Apr 25 17:54:57 2020 -0700
ONNX Export Support for CrossEntropyLoss (#34830)
Summary:
Add ONNX export support for torch.nn.CrossEntropyLoss.
This PR makes following changes:
1. Updates nll_loss export
2. Makes a post pass for SoftmaxCrossEntropy
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34830
Reviewed By: hl475
Differential Revision: D21230712
Pulled By: houseroad
fbshipit-source-id: c81911a41968e23813ba10274340ce4d8ba1ed78
commit 205c6ffbc5febd27b810c37e1bfae50b9655f8e4
Author: Zafar <cc.rafaz@zafar.cc>
Date: Sat Apr 25 17:04:23 2020 -0700
[quant] Generalizing _calculate_dynamic_qparams in quantized test (#36449)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36449
Test Plan: Imported from OSS
Differential Revision: D20984966
Pulled By: z-a-f
fbshipit-source-id: 17437297adae813bc5c6fa43c6c7514f72ce2f6c
commit ca39f99d48a6fc43384a86ecf745df40f038d21f
Author: Haixin Liu <haixin@fb.com>
Date: Sat Apr 25 16:44:13 2020 -0700
[Pytorch Numeric Suite] Add module level comparison (#37242)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37242
Add module level comparison API.
ghstack-source-id: 102853727
Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub'
Reviewed By: raghuramank100
Differential Revision: D21232277
fbshipit-source-id: de707eea101a66a37869129460274c56e4e07db2
commit a04022c656516c08e3719628f39ac47a9328155a
Author: Nikita Shulga <nshulga@fb.com>
Date: Sat Apr 25 15:53:00 2020 -0700
Use `std::chrono::high_resolution_clock` for profiling on Mac (#37280)
Summary:
According to Darwin man-page:
`CLOCK_REALTIME` the system's real time (i.e. wall time) clock, expressed as the amount of time since the Epoch. This is the same as the value returned by `gettimeofday`(2).
I.e., it returns timestamps with microsecond resolution, as can be observed by running the following small program:
```
#include <sys/time.h>
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>
bool conseq_time(clockid_t c) {
  struct timespec t1, t2;
  clock_gettime(c, &t1);
  clock_gettime(c, &t2);
  printf("t1={.tv_sec=%ld, .tv_nsec=%ld}\n", t1.tv_sec, t1.tv_nsec);
  printf("t2={.tv_sec=%ld, .tv_nsec=%ld}\n", t2.tv_sec, t2.tv_nsec);
  bool rc = t1.tv_sec == t2.tv_sec && t1.tv_nsec == t2.tv_nsec;
  printf("Two timestamps are %sequal\n", rc ? "" : "not ");
  return rc;
}

int main(void) {
  printf("using CLOCK_REALTIME\n");
  conseq_time(CLOCK_REALTIME);
  printf("using CLOCK_MONOTONIC_RAW\n");
  conseq_time(CLOCK_MONOTONIC_RAW);
  return 0;
}
```
which, when compiled and run, outputs something like:
```
using CLOCK_REALTIME
t1={.tv_sec=107519, .tv_nsec=860315000}
t2={.tv_sec=107519, .tv_nsec=860315000}
Two timestamps are equal
using CLOCK_MONOTONIC_RAW
t1={.tv_sec=107520, .tv_nsec=954297363}
t2={.tv_sec=107520, .tv_nsec=954297426}
Two timestamps are not equal
```
But why do it this way, if all this platform-specific logic is already nicely abstracted in `std::chrono`:
https://github.com/llvm/llvm-project/blob/master/libcxx/src/chrono.cpp#L117
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37280
Differential Revision: D21246608
Pulled By: malfet
fbshipit-source-id: 6beada30657a2720000e34214b1348112e55be50
commit 59052e39b8daa12a7243ac9e0bbd6714a4fdb861
Author: Zafar <cc.rafaz@zafar.cc>
Date: Sat Apr 25 15:50:38 2020 -0700
[quant] qtensor resize (#36442)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36442
Test Plan: Imported from OSS
Differential Revision: D20984080
Pulled By: z-a-f
fbshipit-source-id: 7fcf24bd2f92f038b670f510118b012d8c7acc74
commit bf860a4ebafbfcb75e61a8603bded72f6d0b0970
Author: Mike Ruberry <38511765+mruberry@users.noreply.github.com>
Date: Sat Apr 25 15:34:39 2020 -0700
Adds missing documentation . (#37295)
Summary:
Fixes torch.isclose documentation missing a `.`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37295
Differential Revision: D21245426
Pulled By: mruberry
fbshipit-source-id: 88ce57ed68c2eac6aa83932780a6ba30e9fa69ea
commit 34284c127930dc12d612c47cab44cf09b432b522
Author: Raghuraman Krishnamoorthi <raghuraman@fb.com>
Date: Sat Apr 25 14:50:40 2020 -0700
Fix NaN error in dynamic quantization in qLinear, re-enable test_quantized_rnn (#36009)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36009
When scale is very small (less than float eps, but greater than the minimum double precision value), computing the reciprocal of scale in float precision within FBGEMM returns inf, while QuantUtils does not. Changed the computation in QuantUtils to also occur in float precision to re-enable the tests.
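A small Python sketch (using `ctypes` to emulate single precision; the values are illustrative) shows why the float reciprocal overflows to inf while the double one stays finite:

```python
import ctypes
import math

def to_f32(x):
    """Round-trip a Python float (a double) through IEEE single precision."""
    return ctypes.c_float(x).value

scale = 1e-40                                # tiny, but a perfectly valid double
inv_in_double = 1.0 / scale                  # ~1e40: finite in double precision
inv_in_float = to_f32(1.0 / to_f32(scale))   # exceeds float max (~3.4e38) -> inf

print(math.isinf(inv_in_float), math.isinf(inv_in_double))  # True False
```

Doing the reciprocal in the same precision on both sides (as this commit does) keeps FBGEMM and QuantUtils consistent for such scales.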
ghstack-source-id: 102896302
Test Plan:
buck test caffe2/test:quantization -- 'test_quantized_rnn \(quantization\.test_quantization\.PostTrainingDynamicQuantTest\)' --print-passing-details --run-disabled
Summary (total time 59.91s):
PASS: 1
FAIL: 0
SKIP: 0
FATAL: 0
TIMEOUT: 0
OMIT: 0
Differential Revision: D20853000
fbshipit-source-id: 948a888f5516b3ba9c6efb7de31ef2cc9d431991
commit 84a31fb4e7fb1b5dbe9e42f5e1e30be4a0440189
Author: Mike Ruberry <mruberry@fb.com>
Date: Sat Apr 25 14:20:33 2020 -0700
Revert D18927221: Boxing uses if_constexpr instead of SFINAE
Test Plan: revert-hammer
Differential Revision:
D18927221
Original commit changeset: 70d99025b45e
fbshipit-source-id: a4b650bbb6d76dda6086d88eb554f3c3077b0f76
commit c90955e3d12391bb7ad22fb9a22eba8f768267a4
Author: James Reed <jamesreed@fb.com>
Date: Sat Apr 25 13:53:12 2020 -0700
[profiler] Sort by end interval as well when parsing CPU trace (#37297)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37297
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D21245463
Pulled By: jamesr66a
fbshipit-source-id: 8d307eaa32fa960b93dfd9a3b0b4c767fd903094
commit ea741f829e825e8ff87ed67cd80a71d65fbb9c73
Author: Nikita Shulga <nshulga@fb.com>
Date: Sat Apr 25 13:51:10 2020 -0700
Add `--repeat` option to python unit-test (#37281)
Summary:
This runs the same test suite (or an individual test) multiple times.
Useful for detecting flaky tests
Example usage: `python test_autograd.py TestAutograd.test_profiler -v --repeat=100`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37281
Differential Revision: D21244442
Pulled By: malfet
fbshipit-source-id: 3ecafec7ae87bc1e418aa28151bbc472ef37a713
commit 44345ad08c0aefcae400b948635f980c907f0f49
Author: Nikita Shulga <nshulga@fb.com>
Date: Sat Apr 25 13:50:50 2020 -0700
Do not define C10_IOS on Mac (#37283)
Summary:
Because macOS is not iOS
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37283
Test Plan: CI
Differential Revision: D21244398
Pulled By: malfet
fbshipit-source-id: b822e216e83887e2f2961b5c5384eaf749629f61
commit cb27067b321dacbc8fd94d9a4b85c62d4244edbf
Author: Negin Raoof <neginmr@utexas.edu>
Date: Sat Apr 25 12:21:03 2020 -0700
[ONNX] Remove inverse op (#37005)
Summary:
ONNX inverse op is being removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37005
Reviewed By: hl475
Differential Revision: D21230728
Pulled By: houseroad
fbshipit-source-id: 7e10414918c57938cda4ca03875c070319d429fb
commit b18f57e5480ce4461c7583d66188357c635e2cbc
Author: Sebastian Messmer <messmer@fb.com>
Date: Sat Apr 25 11:29:38 2020 -0700
Boxing uses if_constexpr instead of SFINAE (#31092)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31092
-
ghstack-source-id: 102878439
Test Plan: unit tests
Reviewed By: ezyang
Differential Revision: D18927221
fbshipit-source-id: 70d99025b45edfaef11a0d587cf8bf8e749df6b8
commit f5e6f1f333b98a596daef9f277cb7f915de91c75
Author: Sebastian Messmer <messmer@fb.com>
Date: Sat Apr 25 11:29:38 2020 -0700
if_constexpr for C++14 (#31091)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31091
This implements a C++17 "if constexpr" like feature for C++14.
This can be used, for example, to replace SFINAE or to force the compiler to remove some parts of a function in the assembly based on a condition.
PRs stacked on top will use this to simplify some of our template metaprogramming.
ghstack-source-id: 102867141
Test Plan: unit tests
Differential Revision: D18927220
fbshipit-source-id: 19a135e00af6ebb0139ce3730353762d4512158f
commit 04b36fc264c63d31e481166c675935b1d99afc5e
Author: Bram Wasti <bwasti@fb.com>
Date: Sat Apr 25 09:59:06 2020 -0700
[TensorExpr] rfactor implementation (#36237)
Summary:
A similar interface to Halide's rfactor: https://halide-lang.org/tutorials/tutorial_lesson_18_parallel_associative_reductions.html
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36237
Reviewed By: zheng-xq
Differential Revision: D21233309
Pulled By: bwasti
fbshipit-source-id: d2706a9e90b707ee195e339f834ff4a54b63a256
commit c52deb694ed9a5e18520a81be07a249fd9a70567
Author: Shen Li <cs.shenli@gmail.com>
Date: Sat Apr 25 09:33:11 2020 -0700
Consolidate usage on torch::jit::toPyObject in RPC request_callback (#37249)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37249
Test Plan: Imported from OSS
Differential Revision: D21234990
Pulled By: mrshenli
fbshipit-source-id: d07210151342bd2ad12d1364d9f22817ee59b0c2
commit 3d934c3d36f8967d79016b36e3cc7b9c2ffa6821
Author: Shen Li <cs.shenli@gmail.com>
Date: Sat Apr 25 09:33:11 2020 -0700
Add using torch::utils::Future to simplify code in RRefContext (#36811)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36811
Test Plan: Imported from OSS
Differential Revision: D21093846
Pulled By: mrshenli
fbshipit-source-id: 61a6b1483ef1533803a18bec216ebe82aa187458
commit 269ec9a139d381605fa898539670163a92d0d107
Author: Shen Li <cs.shenli@gmail.com>
Date: Sat Apr 25 09:33:11 2020 -0700
Prevent RRef.to_here() to block an RPC thread on the callee using Future callbacks (#36805)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36805
Test Plan: Imported from OSS
Differential Revision: D21093847
Pulled By: mrshenli
fbshipit-source-id: 81b0934874af36e03329fe6176628e3aca12811f
commit 6e1e55c1344400f1a38b3e2a2a40f96816cf81d3
Author: Shen Li <shenli@devfair017.maas>
Date: Sat Apr 25 09:33:11 2020 -0700
Prevent RRef unpickle to block waiting for OwnerRRef creation (#36785)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36785
Currently, RRef unpickle (both Python and TorchScript) will block
until the OwnerRRef has been created by the original `rpc.remote`
call, if it is an OwnerRRef. This is not ideal, as correctness
would then depend on the thread-count configuration. This
commit changes that behavior. Both `rpc.remote` and unpickling
can create OwnerRRefs. More specifically, whichever one arrives
first will create the OwnerRRef and the subsequent ones will
retrieve the same OwnerRRef, so that no one is blocking.
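The race-free behavior described above can be sketched as a get-or-create registry (a Python sketch with hypothetical names, not the actual C++ implementation):

```python
import threading

class OwnerRRefRegistry:
    """Sketch: both the rpc.remote path and the unpickle path call
    get_or_create_owner_rref; whichever arrives first creates the OwnerRRef
    and later callers retrieve the same object, so neither path blocks."""
    def __init__(self):
        self._lock = threading.Lock()
        self._owners = {}

    def get_or_create_owner_rref(self, rref_id):
        with self._lock:
            # setdefault makes create-or-retrieve a single atomic step
            # under the lock, so arrival order doesn't matter.
            return self._owners.setdefault(rref_id, {"rref_id": rref_id})

registry = OwnerRRefRegistry()
a = registry.get_or_create_owner_rref(7)   # e.g. unpickle arrives first
b = registry.get_or_create_owner_rref(7)   # rpc.remote gets the same object
```

Because creation is idempotent, correctness no longer depends on how many RPC threads are configured.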
Test Plan: Imported from OSS
Differential Revision: D21083089
Pulled By: mrshenli
fbshipit-source-id: 34ef063d50549b01c968b47815c4fe9fac179d3d
commit 8872e00e11926833c1d7d5a0578524f808ebd631
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Sat Apr 25 09:08:22 2020 -0700
fix type meta
commit d7f7c290e3d76a1e3019166644baf78de0d95a31
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Sat Apr 25 07:40:50 2020 -0700
addmv migration [resubmit] (#37236)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37236
Differential Revision: D21232988
Pulled By: anjali411
fbshipit-source-id: ac6c0ee018aef3c841b039d76e6e1fbb3cd0292d
commit b9a2c35fdf203680a98d30f9201dac2b50548157
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Sat Apr 25 01:31:32 2020 -0700
remove debug print
commit cfd70207b1753648ee5c474bee9905b0543d4db4
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Sat Apr 25 01:30:24 2020 -0700
fix copy kernel
commit b6eb2a5f73640518b40708d6356ccb06b243d8ba
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Sat Apr 25 01:18:44 2020 -0700
fix type in dispatch
commit 856e8cf0288fe3c1701d11fae61b214c08635b9d
Author: Ilia Cherniavskii <iliacher@fb.com>
Date: Sat Apr 25 00:57:06 2020 -0700
Revert D21213786: Enable global observers API
Test Plan: revert-hammer
Differential Revision:
D21213786
Original commit changeset: e618254da74a
fbshipit-source-id: 425ea5d44fa55655ec0dd586c5075996b926177b
commit e6231c9e24c05e435eeb9dfcd66247e4520c559a
Author: Nikita Shulga <nshulga@fb.com>
Date: Sat Apr 25 00:09:09 2020 -0700
Do not run valgrind on the Aten unit tests compiled with clang (#37152)
Summary:
Valgrind detects some uninitialized variables if torch_cpu is compiled with clang, which are not reproducible if the same code is compiled with gcc, nor when using the address sanitizer tool
See https://github.com/pytorch/pytorch/issues/37117
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37152
Differential Revision: D21241577
Pulled By: malfet
fbshipit-source-id: 4a5dddf2a4fc4238dc9117cb92ee4e34af9e6064
commit c9b3d94a4dc571b9c711e7e8e6e378dbd78a2e0a
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 23:59:12 2020 -0700
fix to
commit cd4688138bb52f36978ed6c9daab19ee864429b9
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 23:55:57 2020 -0700
fix to
commit 6e659e928ba48afa8a6f5d734c37ab187734927b
Author: Ilia Cherniavskii <iliacher@fb.com>
Date: Fri Apr 24 23:47:33 2020 -0700
Enable global observers API (#37195)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37195
After adding c10::DispatchKey::Profiler, the behavior of RecordFunction
observers is also controlled by the dispatch key;
this PR moves the logic out of the profiler into the record function
Reviewed By: ngimel
Differential Revision: D21213786
fbshipit-source-id: e618254da74a4f1ce16c51a3869bbd75a4f561ad
commit a0f1c2c97249b8c3e5fc9d537ebaee169cdab88a
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 23:25:53 2020 -0700
fix item
commit 4e976b9334acbcaa015a27d56540cd2115c2639b
Author: Sebastian Messmer <messmer@fb.com>
Date: Fri Apr 24 23:08:18 2020 -0700
Remove callBoxedWorkaround (#36850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36850
Since all unboxing now happens after dispatch, which means that all c10 ops support unboxing, we can now use op.callBoxed() for all ops and no longer need callBoxedWorkaround (which was going through the JIT registry).
ghstack-source-id: 102879558
Test Plan: waitforsandcastle
Differential Revision: D21102375
fbshipit-source-id: d1e041116563a9650d5a86b07eb96d217d8756f3
commit 6efca91edcab0f258293467bc962c3fd1332f79a
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 22:48:46 2020 -0700
enable data_ptr for std::complex
commit 6ea2aedab9da83a2dcf421880436c471ab40f0ec
Author: Hong Xu <hong@topbug.net>
Date: Fri Apr 24 22:35:24 2020 -0700
Cast shape_.size() to int64_t before comparing with squash_dim (#37109)
Summary:
This is generating a considerable amount of warning messages since TensorIterator.h is included from a lot of files:
/home/hong/xusrc/pytorch/aten/src/ATen/native/TensorIterator.h:372:47:
warning: comparison of integers of different signs: 'const int64_t' (aka 'const long') and 'c10::SmallVectorTemplateCommon::size_type' (aka 'unsigned long') [-Wsign-compare]
TORCH_CHECK(squash_dim >= 0 && squash_dim < shape_.size(),
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37109
Differential Revision: D21242163
Pulled By: ngimel
fbshipit-source-id: aec2978ee76750676a449eb6671142a782658de3
commit 35decf020b31c7ca25e5c65492046077ceb9f2cc
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 22:23:12 2020 -0700
type meta
commit 30eb0bdf3257a62df303ab59991ad6eb784dd177
Author: Nikita Shulga <nshulga@fb.com>
Date: Fri Apr 24 21:30:25 2020 -0700
Do not define list "0" in torch/CMakeLists.txt (#37275)
Summary:
Per https://cmake.org/cmake/help/latest/command/list.html, the argument order for list insertion is
`list(INSERT <list> <index> [<element>...])`
That is, the first argument is the list name, not the index at which elements get inserted
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37275
Differential Revision: D21243539
Pulled By: malfet
fbshipit-source-id: b947ad64f1a3549df68083383537899b19abd9ca
commit 904949382e36c282c547db545d98bde23553695f
Author: Raghuraman Krishnamoorthi <raghuraman@fb.com>
Date: Fri Apr 24 20:55:32 2020 -0700
Ensure that histogram observers have zero-point of zero for post ReLU activations (#37107)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37107
Currently histogram observers relax both the min and max values of the activations for performance speedup reasons. This causes an issue for glow where there is a slow down if the zero-point is not zero for post ReLU activations.
ghstack-source-id: 102768017
Test Plan: buck test caffe2/test:quantization -- 'test_histogram_observer_one_sided \(quantization\.test_quantization\.RecordHistogramObserverTest\)' --print-passing-details
Differential Revision: D21187636
fbshipit-source-id: 8d616b9e9caf2979a26a215e99434f71025e3d8b
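For context on why a minimum of zero forces a zero-point of zero: with standard affine quantization parameters, zero_point = qmin - min_val/scale, so pinning min_val to 0 pins zero_point to qmin. A small illustrative Python sketch (not the observer's actual code; the function name is invented):

```python
def quant_params(min_val, max_val, qmin=0, qmax=255):
    # Affine quantization: q = round(x / scale) + zero_point, with the
    # float range [min_val, max_val] mapped onto the integer range
    # [qmin, qmax]. zero_point is the integer that represents 0.0.
    scale = (max_val - min_val) / (qmax - qmin)
    zero_point = int(round(qmin - min_val / scale))
    return scale, zero_point
```

Keeping min_val fixed at 0 for post-ReLU activations therefore forces zero_point == qmin == 0, which is what the one-sided histogram observer test checks.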
commit ef9ec03e770d36b7138189ac5a96515487902a2f
Author: Xiaodong Wang <xdwang@fb.com>
Date: Fri Apr 24 20:27:05 2020 -0700
[CUDA11] Pytorch change (#37187)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37187
Adding CUDACC guard for gcc9+
Reviewed By: ngimel
Differential Revision: D21209798
fbshipit-source-id: 5cc4efc7108577d74bee4c12c942ed1e5bf9bbac
commit 984f1ef2d6d967f3aca9b23f7b763f484260212b
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 20:25:48 2020 -0700
ident
commit ea25b50901a5c56e8845ee364a95d82fc41f95e6
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 20:20:28 2020 -0700
save
commit a80a438e3752d0f4b1820492e9d0051760b926bb
Author: Nikolay Korovaiko <korovaikon@gmail.com>
Date: Fri Apr 24 20:10:21 2020 -0700
correctly set and restore states in te tests (#37210)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37210
Differential Revision: D21238634
Pulled By: Krovatkin
fbshipit-source-id: 6462239753399c10c871baa5d5fdff5465cf2544
commit 686b521784a869cd48a75a16fce38bc25560a2ef
Author: Xiao Wang <24860335+xwang233@users.noreply.github.com>
Date: Fri Apr 24 20:10:10 2020 -0700
Update cusparse deprecated Xcsrmm2 call (#37202)
Summary:
Reland of https://github.com/pytorch/pytorch/issues/36845 due to Windows CI failure.
binary_windows_wheel_3_7_cu102_build is passed, so the windows guard should be fine this time.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37202
Differential Revision: D21233358
Pulled By: xw285cornell
fbshipit-source-id: 707de0ff21d178686354ffaea7625f1d68b3e8d3
commit 22e79aaaa4cd395eee9409d13dc64f7ce1f85b1e
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 20:01:02 2020 -0700
save
commit 4a72ddedcd2c645bb8fd507b375a0a42483ad1e1
Author: Gao, Xiang <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 19:47:36 2020 -0700
Show cpu info for macos jobs (#37220)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37220
Differential Revision: D21243205
Pulled By: ezyang
fbshipit-source-id: 77a4d904e80c59b6d4d39b1a1a0fb441d8a35f0c
commit 1d0334dd62ae18c7fd0c9fa5d048bf4a796e0c16
Author: Yang Gu <yangu@microsoft.com>
Date: Fri Apr 24 19:46:53 2020 -0700
Add cpu build and test to Windows CI (#37135)
Summary:
Add windows build and test for cpu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37135
Differential Revision: D21243189
Pulled By: ezyang
fbshipit-source-id: dd804ac258940e608facaf375d80ff5a0c59a7ae
commit 4a05558bd9de93fd90b85099b9606a03f53cd163
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 19:37:14 2020 -0700
fix distribution
commit 5b7d9817c35fa4fb6adf31929b41e019cbdf958e
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 19:23:37 2020 -0700
fix
commit f71593ee31f2366dafd660b4bbcfb3086dea0e81
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 19:11:47 2020 -0700
fix
commit ff19d415d769d8e12dbd06ba5aae5e9ea951179d
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 19:01:42 2020 -0700
fix comment
commit 3ada82d2364ca43e77eb7f77ff8e43fb858f296e
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 18:56:29 2020 -0700
Automatically include c10/util/dont_wrap_complex.h
commit 398608de9a4bb2ab17dbfa826f435091fc75c8d3
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 18:53:12 2020 -0700
fix
commit 093564d6918242aff70cd281554ab0142e01751e
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 18:38:16 2020 -0700
fix
commit f71f97e17a4f3a4abf75ccd9d2f32a6312d8ebc5
Merge: 626473f5fe 1d8012a624
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 18:22:23 2020 -0700
Merge branch 'master' of github.com:pytorch/pytorch into hacking-dispatch
commit 626473f5fe717d106e4888b9afdb49cd38782d81
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 18:20:26 2020 -0700
Make c10::complex the C++ type for complex tensors
commit 1d8012a624e4dbc9f66c7942e82e168707796855
Author: Sebastian Messmer <messmer@fb.com>
Date: Fri Apr 24 18:05:47 2020 -0700
Delete dead code (#37254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37254
This code is leftover from the KernelFactory deletion.
ghstack-source-id: 102866045
Test Plan: waitforsandcastle
Differential Revision: D21235480
fbshipit-source-id: 739ba677d2139ba9934d103f75a609638f1a3856
commit 1f08ff12ecd27cf18fe21cf1fcf90a1c824b3ff7
Author: Michael Suo <suo@fb.com>
Date: Fri Apr 24 17:40:48 2020 -0700
[jit] fix named tuples as attributes (#37251)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37251
This was broken by recent changes to how we serialize with type tags. We
save a name (like `Dict[str, MyNamedTuple]`) and then rely on the
mobile type parser to resolve that name back into a set of types.
This doesn't work for any NamedTypes as the mobile type parser doesn't
know how to resolve those. The unpickler allows the caller to inject a
type resolver in for this purpose, use that so that when importing in a
non-mobile environment you get the right results.
A second problem also had to be fixed: the SourceImporter type loader
would only load named types directly (e.g. `MyNamedTuple`) and choked if
it was a general type that contained a named tuple (e.g.
`List[MyNamedTuple]`). Fixed that and renamed `loadNamedType` to
`loadType` for clarity.
Test Plan: Imported from OSS
Differential Revision: D21235213
Pulled By: suo
fbshipit-source-id: 16db0f4c5e91a890d67a8687cc8ababa6b94b0f4
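The injectable type resolver described in the commit above can be sketched in a few lines of Python. This is purely illustrative (`parse_type`, `split_top`, and the toy builtin set are invented, not the TorchScript parser): the generic parser handles container syntax itself and delegates any unrecognized name to the caller-supplied resolver.

```python
def split_top(s):
    # Split on commas that are not nested inside brackets.
    parts, depth, cur = [], 0, []
    for ch in s:
        if ch == "," and depth == 0:
            parts.append("".join(cur))
            cur = []
        else:
            depth += ch == "["
            depth -= ch == "]"
            cur.append(ch)
    parts.append("".join(cur))
    return parts

def parse_type(s, resolve_named):
    # Containers (List[...], Dict[...], ...) are parsed structurally; any
    # unrecognized bare name is delegated to the injected resolver, which
    # is how NamedTypes can be resolved outside the mobile type parser.
    s = s.strip()
    if s.endswith("]"):
        head, inner = s.split("[", 1)
        return (head, [parse_type(p, resolve_named) for p in split_top(inner[:-1])])
    builtins = {"int", "float", "bool", "str", "Tensor"}
    return s if s in builtins else resolve_named(s)
```

With a resolver injected, a name like `Dict[str, MyNamedTuple]` round-trips to the right named type instead of failing in the generic parser.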
commit 47c4dca1ab3fedfde7b1ce383e779454e7903e86
Author: Nikita Shulga <nshulga@fb.com>
Date: Fri Apr 24 17:39:53 2020 -0700
Remove python-2 or python<3.5 checks from unit tests (#37252)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37252
Test Plan: CI
Differential Revision: D21241083
Pulled By: malfet
fbshipit-source-id: 44164b822f7905288abb2beda0175d2162d86143
commit 521910e0e97f6014c976cdab7dff024a038a0a76
Author: Michael Suo <suo@fb.com>
Date: Fri Apr 24 17:17:39 2020 -0700
Update clang_format_ci.sh (#37268)
Summary:
shellcheck led me astray!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37268
Differential Revision: D21241361
Pulled By: suo
fbshipit-source-id: 68244bb889e784ccd36d714209c2c15e2d6f04f8
commit b60c3dfdd963cd5b0879d9fae5130fac3ed79bbf
Author: James Reed <jamesreed@fb.com>
Date: Fri Apr 24 16:22:25 2020 -0700
Add fallback wrapper for profiler (#37194)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37194
Test Plan: Imported from OSS
Reviewed By: ilia-cher, ngimel
Differential Revision: D21217886
Pulled By: jamesr66a
fbshipit-source-id: b06195e9ac110979d128391e067d5c9f416c1873
commit 047488a7ffb42a4dad5c12992663738bd6c96004
Author: Basil Hosmer <bhosmer@fb.com>
Date: Fri Apr 24 16:06:08 2020 -0700
Mask all high dispatch keys in BackendSelect kernels (#37257)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37257
Previously, we were relying on fragile invariants to avoid collecting
and feeding high precedence, non-backend dispatch keys to backend
initialization machinery, which would assert on them. (These same
keys are then used for redispatch, so a second latent problem lurks
behind the first.) Here we mask off the BackendDispatch key and all
keys to its left.
Followup: move backend init code to backend-specific wrappers
(`CPUType` etc.). This will let us remove the backend init code from
both BackendSelect and STATIC_DISPATCH wrappers. (Though BackendSelect
will still need to compute a dispatch key, so the logic introduced
here will still be necessary.)
Test Plan: Imported from OSS
Differential Revision: D21235856
Pulled By: bhosmer
fbshipit-source-id: 1b8bd7897ed4b41a95718f3cfceddf4ee094744a
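The masking step above can be pictured as a bitset operation. A toy Python sketch (bit positions are invented for illustration; the real DispatchKeySet is a C++ 64-bit bitset ordered by precedence, with backend keys in the low bits):

```python
CPU_BIT, CUDA_BIT = 0, 1
BACKEND_SELECT_BIT = 10  # everything at or above this is non-backend
AUTOGRAD_BIT = 12

def backend_keys_only(key_set):
    # Mask off the BackendSelect bit and every higher-precedence bit,
    # leaving only genuine backend keys for backend initialization.
    return key_set & ((1 << BACKEND_SELECT_BIT) - 1)
```

The unmasked set is still kept around for redispatch; only the backend-initialization path sees the masked value.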
commit b6bb644e41b3928b5a515330ad35c8b447fcb876
Author: Zachary DeVito <zdevito@fb.com>
Date: Fri Apr 24 15:12:12 2020 -0700
Fix long line splitting issue in python_print (#37088)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37088
For an inlined expression tree like `(e_0, (e_1, e_long))` the previous
algorithm only scanned the same statement as `e_long`, splitting the
inlined expressions across lines. Because it did not scan `e_0`, `e_0`
would still get emitted inline, causing it to reverse order with `e_1` and
`e_long`. The new algorithm scans starting at `e_long` and going all
the way back up the expression until it reaches the end of the inlined
statement. Caching of what has already been scanned has been added so that
if there were a second long expression `e_long2` after `e_long`, it would not
rescan and re-inline the statements that were already split.
Test Plan: Imported from OSS
Differential Revision: D21180394
Pulled By: zdevito
fbshipit-source-id: 4d142c83a04c89a47d04282f67a513f82cf153c0
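The fixed scanning behavior can be modeled in miniature: hoist every inlined expression up to and including the long one, caching what was already hoisted so a later long expression does not rescan. This toy Python sketch is illustrative only (names like `hoist_long` are invented; it is not the python_print code):

```python
def hoist_long(exprs, max_len, hoisted=None):
    # exprs: inlined expressions of one statement, in evaluation order.
    # If any expression is too long, hoist *every* expression up to and
    # including it into a temporary, so left-to-right evaluation order is
    # preserved. `hoisted` caches indices already split out, so a second
    # long expression does not re-hoist earlier ones.
    hoisted = {} if hoisted is None else hoisted
    long_idxs = [i for i, e in enumerate(exprs) if len(e) > max_len]
    stmts = []
    if long_idxs:
        for i in range(long_idxs[-1] + 1):
            if i not in hoisted:
                hoisted[i] = f"_{i}"
                stmts.append(f"_{i} = {exprs[i]}")
    rendered = [hoisted.get(i, e) for i, e in enumerate(exprs)]
    return stmts, rendered
```

Hoisting only the long expression itself would emit it before `e_0` and `e_1`, reversing their evaluation order, which is exactly the bug the commit describes.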
commit d6ce6570f96e8edbf450728a5bfa080f181bcba0
Author: Hong Xu <hong@topbug.net>
Date: Fri Apr 24 15:08:39 2020 -0700
Remove unused imports in aten/src/ATen/function_wrapper.py (#37245)
Summary:
typing is available since Python 3.5, no need to try-import.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37245
Differential Revision: D21236650
Pulled By: albanD
fbshipit-source-id: daf150103835d0c6cd3c39300044e548bb6d311d
commit 4f3946a89b639f3b87c37b4190e2bc3dc22ee608
Author: anjali411 <chourdiaanjali123@gmail.com>
Date: Fri Apr 24 15:03:38 2020 -0700
Added complex dtypes to get_all_math_dtypes, complex acc type for cpu, fixed rdiv and pow for complex (#37193)
Summary:
Resolves https://github.com/pytorch/pytorch/issues/36730 https://github.com/pytorch/pytorch/issues/36057
Partially resolves: https://github.com/pytorch/pytorch/issues/36671
```
>>> 2j / torch.tensor([4], dtype = torch.complex64)
tensor([(0.0000+0.5000j)], dtype=torch.complex64)
>>> 1 / torch.tensor(3+4j)
tensor((0.1200-0.1600j), dtype=torch.complex64)
```
rdiv is more generally broken for all dtypes because it doesn't promote the types properly
e.g.:
```
>>> 1 / torch.tensor(2)
tensor(0)
>>> 2j / torch.tensor(4)
tensor(0)
```
so that issue should be fixed in a separate PR
Adding CPU acc types for complex
Added cumsum, cumprod for complex dtypes
Added complex dtypes to get_all_math_dtypes to expand testing for complex dtypes
Old PR - https://github.com/pytorch/pytorch/pull/36747
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37193
Differential Revision: D21229373
Pulled By: anjali411
fbshipit-source-id: 8a086136d8c10dabe62358d276331e3f22bb2342
commit c38dcd45d70b2850047d9956e45ff3312966a078
Author: Wanchao Liang <wanchaol@users.noreply.github.com>
Date: Fri Apr 24 14:45:11 2020 -0700
[jit] fix return different types bug in tracing module calls (#37190)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37190
if module calls return different types, we need to record them correctly
Test Plan: Imported from OSS
Differential Revision: D21214871
Pulled By: wanchaol
fbshipit-source-id: 46ba98f08ed4ade22f9740cb3fca84b29557e125
commit 5362a0b948450e2d2ba5f8ce2157a65b2f06b392
Author: Wanchao Liang <wanchaol@users.noreply.github.com>
Date: Fri Apr 24 14:45:11 2020 -0700
[jit] fix lifting bug in tracing module calls (#37189)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37189
This fixes a bug in tracing module calls to correctly lift values with
their corresponding value type, rather than the default tensor type.
Test Plan: Imported from OSS
Differential Revision: D21214872
Pulled By: wanchaol
fbshipit-source-id: f635154851365e2d7b88186d6e47634123eac42f
commit a13b5b0ae85ea6b9ba6038f99658a88039e23782
Author: Xiang Gao <qasdfgtyuiop@gmail.com>
Date: Fri Apr 24 14:16:54 2020 -0700
Split reduction compile units (#37205)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37205
Test Plan: Imported from OSS
Differential Revision: D21233254
Pulled By: ngimel
fbshipit-source-id: 68b37ebbdd715a30c616e425a39b6b21c01b37e2