Enable static memory planning for pipeline. (#4204)
* Enable static memory planning for pipeline.
1. We fix a bug when resolving symbolic shape for scalars.
2. We pass the original inputs to all pipeline stages so that
the symbolic shapes can be resolved.
* Further Improvements
1. Address comments.
2. Further reduce activation size by ~50% when pipeline is on.
This is done by removing all but one gradient tensor from the last
RecordEvent in the backward pass.
* Address a comment
* Fix Windows build