pytorch
82cb7210 - [PyTorch] Fix record function inputs_valid_ check (#78002)

Commit
2 years ago
[PyTorch] Fix record function inputs_valid_ check (#78002) Summary: I think this has to be set in all before() calls. Because by default inputs_valid_ = false;. For RecordFunction without any input parameters (so this interface is not used), calling record_func.inputs() will cause an assert: ``` fbcode/caffe2/aten/src/ATen/record_function.h 322 TORCH_INTERNAL_ASSERT_DEBUG_ONLY(inputs_valid_, "Called inputs() outside RecordFunction start callback"); ``` I suppose, an alternative is to require users to call num_inputs() to check before calling inputs(). But I think the intent of inputs_valid_ is for verifying inputs are being requested within the lifetime of the start callback. Test Plan: After this diff fix: ``` => buck build mode/dev-nosan caffe2/test:profiler --show-output => buck-out/gen/caffe2/test/profiler#binary.par test_profiler.TestRecordFunction.test_record_function test_record_function (test_profiler.TestRecordFunction) ... Log file: /tmp/libkineto_activities_763488.json INFO:2022-05-20 13:02:33 763488:763488 Config.cpp:470] Trace start time: 2022-05-20 13:02:48 Trace duration: 500ms Warmup duration: 5s Max GPU buffer size: 128MB Enabled activities: cpu_op,user_annotation,external_correlation,cuda_runtime,cpu_instant_event,python_function Manifold bucket: gpu_traces Manifold object: tree/traces/clientAPI/0/1653076953/devgpu040.ftw6/libkineto_activities_763488.json Trace compression enabled: 1 INFO:2022-05-20 13:02:33 763488:763488 CuptiActivityProfiler.cpp:559] Enabling GPU tracing INFO:2022-05-20 13:02:33 763488:763488 CuptiActivityProfiler.cpp:486] Running child profiler CuptiRangeProfiler for 500 ms INFO:2022-05-20 13:02:33 763488:763488 CuptiActivityProfiler.cpp:593] Tracing starting in 14s INFO:2022-05-20 13:02:33 763488:763488 CuptiActivityProfiler.cpp:596] Tracing will end in 15s WARNING: Logging before InitGoogleLogging() is written to STDERR I0520 13:02:34.013141 763488 Logger.cpp:2273] Dropping logs in unit tests. Set shouldLogDuringTests=True in your CBLC to fix this STAGE:2022-05-20 13:02:33 763488:763488 ActivityProfilerController.cpp:269] Completed Stage: Warm Up STAGE:2022-05-20 13:02:34 763488:763488 ActivityProfilerController.cpp:275] Completed Stage: Collection INFO:2022-05-20 13:02:34 763488:763488 CuptiActivityProfiler.cpp:134] Processing 1 CPU buffers INFO:2022-05-20 13:02:34 763488:763488 CuptiActivityProfiler.cpp:771] Traces Recorded: INFO:2022-05-20 13:02:34 763488:763488 CuptiActivityProfiler.cpp:774] PyTorch Profiler: 1 iterations ok ---------------------------------------------------------------------- Ran 1 test in 0.060s OK ``` New test failure case: ``` => buck-out/gen/caffe2/test/profiler#binary.par test_profiler.TestRecordFunction.test_record_function test_record_function (test_profiler.TestRecordFunction) ... Log file: /tmp/libkineto_activities_808629.json INFO:2022-05-20 13:04:46 808629:808629 Config.cpp:470] Trace start time: 2022-05-20 13:05:01 Trace duration: 500ms Warmup duration: 5s Max GPU buffer size: 128MB Enabled activities: cpu_op,user_annotation,external_correlation,cuda_runtime,cpu_instant_event,python_function Manifold bucket: gpu_traces Manifold object: tree/traces/clientAPI/0/1653077086/devgpu040.ftw6/libkineto_activities_808629.json Trace compression enabled: 1 INFO:2022-05-20 13:04:46 808629:808629 CuptiActivityProfiler.cpp:559] Enabling GPU tracing INFO:2022-05-20 13:04:46 808629:808629 CuptiActivityProfiler.cpp:486] Running child profiler CuptiRangeProfiler for 500 ms INFO:2022-05-20 13:04:46 808629:808629 CuptiActivityProfiler.cpp:593] Tracing starting in 14s INFO:2022-05-20 13:04:46 808629:808629 CuptiActivityProfiler.cpp:596] Tracing will end in 15s WARNING: Logging before InitGoogleLogging() is written to STDERR I0520 13:04:46.853253 808629 Logger.cpp:2273] Dropping logs in unit tests. Set shouldLogDuringTests=True in your CBLC to fix this STAGE:2022-05-20 13:04:46 808629:808629 ActivityProfilerController.cpp:269] Completed Stage: Warm Up W0520 13:04:48.126065 808629 record_function.cpp:470] Exception in RecordFunction callback: inputs_valid_ INTERNAL ASSERT FAILED at "caffe2/aten/src/ATen/record_function.h":322, please report a bug to PyTorch. Called inputs() outside RecordFunction start callback Exception raised from inputs at caffe2/aten/src/ATen/record_function.h:322 (most recent call first): # 0 c10::get_backtrace[abi:cxx11](unsigned long, unsigned long, bool) # 1 std::_Function_handler<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (), c10::(anonymous namespace)::GetFetchStackTrace()::$_1>::_M_invoke(std::_Any_data const&) # 2 c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) # 3 c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) # 4 c10::detail::torchInternalAssertFail(char const*, char const*, unsigned int, char const*, char const*) # 5 torch::profiler::impl::ThreadLocalSubqueue::begin_op(at::RecordFunction const&, unsigned long) # 6 std::unique_ptr<at::ObserverContext, std::default_delete<at::ObserverContext> > torch::autograd::profiler::(anonymous namespace)::onFunctionEnter<false>(at::RecordFunction const&) # 7 at::RecordFunction::runStartCallbacks() ``` Differential Revision: D36556512 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78002 Approved by: https://github.com/swolchok
Author
Committer
Parents
Loading