Fix gradient buffer access for DeepCompile Z1/2 (#7548)
Initialization of DeepCompile with ZeRO stage 1/2 (Z1/2) currently fails
because of the change introduced in #7509.
This PR resolves the issue by:
- Adding an argument to optimizer.get_flat_partition
- Skipping the entire allreduce function in the engine
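
As a rough illustration of the second point, the early-return guard could look like the sketch below. This is not the actual DeepSpeed code: `ToyEngine`, `deepcompile_enabled`, and `allreduce_gradients` are hypothetical names chosen for the example, and the real engine method and flag may differ.

```python
class ToyEngine:
    """Hypothetical stand-in for the engine; not DeepSpeed's real class."""

    def __init__(self, deepcompile_enabled: bool):
        # Assumed flag: True when DeepCompile is active for this run.
        self.deepcompile_enabled = deepcompile_enabled
        self.allreduce_calls = 0

    def allreduce_gradients(self) -> bool:
        # Skip the whole engine-side allreduce path when DeepCompile is on,
        # so gradient buffers are never touched from here.
        if self.deepcompile_enabled:
            return False
        # Placeholder for the regular reduction work in the non-compiled path.
        self.allreduce_calls += 1
        return True


if __name__ == "__main__":
    compiled = ToyEngine(deepcompile_enabled=True)
    eager = ToyEngine(deepcompile_enabled=False)
    print(compiled.allreduce_gradients(), compiled.allreduce_calls)
    print(eager.allreduce_gradients(), eager.allreduce_calls)
```

The guard sits at the top of the function so that no code after it needs to know whether DeepCompile is enabled.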
---------
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>