Save optimized pre_grad graph once ready (#16816)
### Save optimized pre_grad graph once it's ready
`graph_builder.build()` did two things for training: 1. optimized
forward graph, e.g. pre_grad graph optimization. 2. build gradient
graph.
Originally after `graph_builder.build()` completed, pre_graph graph is
saved. While if pre_grad graph optimization completed, but fail during
gradient graph build, we still cannot get pre_grad graph to investigate.
This PR made the change once pre_grad graph is ready, we save it (if
save_model is enabled) in C++ backend.