BF16 optimizer: Clear lp grads after updating hp grads in hook (#5328)
This fix addresses two problems:
- The previous iteration's lp grads stay alive during the next
iteration's forward pass, which increases the memory footprint.
- The hook's behavior does not match its name,
accumulate_hp_grads_and_remove_lp.
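
For illustration, the intended hook behavior can be sketched roughly as
follows. This is a simplified model, not the actual DeepSpeed code: plain
Python lists stand in for tensors, and the Param class and its hp_grad
attribute are hypothetical names introduced only for this example.

```python
# Hypothetical sketch: plain Python lists stand in for tensors, and the
# class and attribute names are illustrative, not DeepSpeed's actual API.

class Param:
    """A low-precision (lp) parameter with a high-precision (hp) grad buffer."""

    def __init__(self, size):
        self.grad = None               # lp grad, filled in by backward
        self.hp_grad = [0.0] * size    # persistent hp accumulation buffer


def accumulate_hp_grads_and_remove_lp(param):
    """Add the lp grad into the hp buffer, then release the lp grad."""
    if param.grad is None:
        return
    for i, g in enumerate(param.grad):
        param.hp_grad[i] += float(g)
    # Dropping the reference here means the lp grad does not stay
    # alive into the next iteration's forward pass.
    param.grad = None


p = Param(3)
for _ in range(2):                     # two micro-batches
    p.grad = [0.5, 1.0, 1.5]           # pretend backward produced this
    accumulate_hp_grads_and_remove_lp(p)
```

After the loop, the hp buffer holds the accumulated sum and the lp grad
reference has been cleared, so only one copy of the gradient remains.
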
Co-authored-by: qunyang <quyang@habana.ai>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>