[AOTI][refactor] Move ThreadLocalCachedOutputTensor into a separate header (#119392)
Summary: Move the common functionality (ThreadLocalCachedOutputTensor) into a separate header so that JIT Inductor and AOT Inductor can share it later.
Test Plan: CI
Differential Revision: D53523452
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119392
Approved by: https://github.com/khabinov