Pipeline: Add support for eval micro bs configuration (#4859)
Evaluation generally consumes less memory than training, mainly because
no gradients are kept and forward activations are not held for a
backward pass. This makes it possible to increase the micro-bs and
improve evaluation performance.
This commit adds the option to pass num_micro_batches to eval_batch().
Until now, evaluation assumed the same micro-bs and global-bs as
training, and therefore the same number of micro batches.
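As a rough illustration of the arithmetic (not code from this commit), the sketch below shows how a larger eval micro-bs reduces the number of micro batches for a fixed global-bs. The helper name num_micro_batches_for, its parameters, and the batch sizes are hypothetical and chosen only for the example.

```python
def num_micro_batches_for(global_batch_size: int,
                          micro_batch_size: int,
                          dp_world_size: int = 1) -> int:
    """Hypothetical helper: micro batches needed to cover one global batch.

    Assumes global_batch_size = micro_batch_size * num_micro_batches * dp_world_size.
    """
    samples_per_micro_step = micro_batch_size * dp_world_size
    if global_batch_size % samples_per_micro_step != 0:
        raise ValueError("global batch size must be divisible by "
                         "micro_batch_size * dp_world_size")
    return global_batch_size // samples_per_micro_step


# Illustrative numbers only: same global-bs, larger micro-bs for evaluation.
train_micro_batches = num_micro_batches_for(64, micro_batch_size=4, dp_world_size=2)
eval_micro_batches = num_micro_batches_for(64, micro_batch_size=16, dp_world_size=2)

# The eval count is what a caller would then hand to the pipeline engine,
# e.g. engine.eval_batch(data_iter, num_micro_batches=eval_micro_batches),
# where engine and data_iter are assumed to exist in the caller's code.
```

With these numbers, training uses 8 micro batches and evaluation uses 2.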
This commit also modifies _scale_loss_by_gas in runtime/engine.py to
use the number of eval micro batches for loss scaling instead of the
training gas.
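To see why the divisor matters, here is a minimal sketch of the idea, not the actual body of _scale_loss_by_gas. The function scale_loss and its parameters are hypothetical stand-ins, and the loss values are made up.

```python
from typing import Optional


def scale_loss(loss: float,
               train_gas: int,
               eval_micro_batches: Optional[int] = None) -> float:
    """Hypothetical stand-in: divide a per-micro-batch loss so that the
    sum over micro batches equals the mean loss."""
    # Use the eval micro batch count when given, else the training gas.
    divisor = train_gas if eval_micro_batches is None else eval_micro_batches
    return loss / divisor


micro_losses = [2.0, 4.0]  # two eval micro batches, true mean is 3.0

# Scaling by the number of eval micro batches recovers the mean.
eval_scaled = sum(scale_loss(l, train_gas=8, eval_micro_batches=2)
                  for l in micro_losses)

# Scaling by the training gas (8) under-reports the eval loss.
gas_scaled = sum(scale_loss(l, train_gas=8) for l in micro_losses)
```

Here eval_scaled is 3.0, while gas_scaled is 0.75, which is the mismatch the change avoids when the eval micro batch count differs from the training gas.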
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>