Gradient Accumulation optimization verified for correctness (#8273)
* Fetch frontier tensors to the frontend
* Move before session initialize call
* Fetch tensor and add to cache
* Rest of the changes for using cache
* Review comments
* Review changes
* Review comments
* Switch to shared_ptr
* Fix bug after rebase
* Frontend docstring change