[nnc] Insert alloc/free at global scope (#61725)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61725
Alloc/free inside a loop isn't really an optimization, and furthermore
it breaks some attempted optimization in the llvm backend: we use alloca for
small allocations, which is efficient since alloca is on the stack, but there's
no corresponding free, so we leak tons of stack. I hit this while building an
rfactor buffer inside a very deeply nested loop.
ghstack-source-id: 133627310
Test Plan:
Unit test which simulates use of a temp buffer in a deeply nested
loop.
Reviewed By: navahgar
Differential Revision: D29533364
fbshipit-source-id: c321f4cb05304cfb9146afe32edc4567b623412e