flax
9fdf59f0 - Fold optimizer step into dropout_rng instead of splitting the dropout_rng.

Commit
4 years ago
Fold optimizer step into dropout_rng instead of splitting the dropout_rng. We don't checkpoint the dropout_rng. Folding in the step allows us to have the same random sequence even if the job is interrupted. PiperOrigin-RevId: 349541996
Author
Committer
Parents
Loading