pytorch
c44ae554 - Skip the source info in the error report if the source code is too large (#105608)


Commit

1 year ago

Skip the source info in the error report if the source code is too large (#105608)

Summary:
A small model (<100MB) took about 20 minutes to load and consumed 16GB of memory. Strobelight profiling: https://fburl.com/strobelight/abwtz0ry

We found that calc_line_start_offsets is the culprit. line_starting_offsets_ is a vector with one entry per source line. There are >20000 places where such an ErrorReport is generated, and the source has ~100000 lines, so the total memory cost is about 100000 x 20000 x 8 bytes = ~16GB.

We propose to skip the source info in the error report for extremely large source files (>1MB), and to keep an environment variable that preserves the ability to print the source code info for large source files.

Test Plan:
buck run mode/opt-split-dwarf scripts/lufang:load_pt_model -- --model_file_path=/data/local/models/961746678/2/961746678_2.predictor.disagg.gpu.local

Before the change, the model takes 20 minutes to load and costs 16GB of memory (the model itself is only <100MB).
After the change, it takes 15s to load.

Most of the time and space is spent in calc_line_start_offsets: https://fburl.com/code/2to60zqu

Differential Revision: D47610805
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105608
Approved by: https://github.com/hl475
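The idea in the summary can be illustrated with a minimal Python sketch. This is not the PyTorch C++ implementation: the constant MAX_SOURCE_BYTES, the environment variable name EXAMPLE_KEEP_LARGE_SOURCE_INFO, and the helpers line_offsets_for_report and estimated_bytes are all hypothetical names chosen for illustration. Only calc_line_start_offsets borrows its name from the commit message, and its body here is an assumption about what such a function does.

```python
import os
from typing import List, Mapping, Optional

# Hypothetical names: neither the constant nor the environment variable
# is taken from PyTorch; they only illustrate the approach.
MAX_SOURCE_BYTES = 1 << 20  # 1MB cutoff, as mentioned in the summary
ENV_OVERRIDE = "EXAMPLE_KEEP_LARGE_SOURCE_INFO"


def calc_line_start_offsets(text: str) -> List[int]:
    """Return the offset at which each line of `text` starts."""
    offsets = [0]
    for pos, ch in enumerate(text):
        if ch == "\n":
            offsets.append(pos + 1)
    return offsets


def line_offsets_for_report(
    text: str, env: Optional[Mapping[str, str]] = None
) -> Optional[List[int]]:
    """Skip the per-line table for large sources unless the override is set."""
    env = os.environ if env is None else env
    too_large = len(text.encode("utf-8")) > MAX_SOURCE_BYTES
    if too_large and env.get(ENV_OVERRIDE) != "1":
        return None  # no source info will be attached to the report
    return calc_line_start_offsets(text)


def estimated_bytes(num_lines: int, num_reports: int, bytes_per_entry: int = 8) -> int:
    """Memory used if every report keeps its own table of line offsets."""
    return num_lines * num_reports * bytes_per_entry


# The figures quoted in the summary: ~100000 lines, >20000 reports, 8 bytes each.
print(estimated_bytes(100_000, 20_000))
```

With the figures from the summary, the estimate comes to 16,000,000,000 bytes, which matches the ~16GB observed. Returning None for oversized sources trades detailed source locations in error messages for load time and memory, and the override keeps the old behavior available for debugging.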

Author

houseroad

Committer

pytorchmergebot

Parents

e3539a0e
