Update to new torch grad hook API: BF16Optimizer and Stage2 (#7189)
This commit extends PR
[#6773](https://github.com/deepspeedai/DeepSpeed/pull/6773) to ZeRO
Stage 2 and to BF16Optimizer.
Starting with PyTorch 2.1, there is a new, more robust hook API on the
parameter itself:
`param.register_post_accumulate_grad_hook()`
The appropriate API is selected automatically based on the installed
PyTorch version.
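
For illustration only, the selection could look roughly like the sketch
below. It is not the code from this commit: the helper name
`register_grad_ready_hook` is hypothetical, and it uses feature detection
(`hasattr`) as a stand-in for an explicit PyTorch version check. The
fallback branch assumes the older pattern of reaching the parameter's
`AccumulateGrad` node through `expand_as`.

```python
def register_grad_ready_hook(param, callback):
    """Arrange for callback(param) to run after param's gradient is accumulated.

    Hypothetical helper, not DeepSpeed's actual implementation.
    Returns (handle, keep_alive). On the fallback path the caller should
    hold on to keep_alive so the autograd node is not garbage collected.
    """
    if hasattr(param, "register_post_accumulate_grad_hook"):
        # PyTorch >= 2.1: the hook is registered on the parameter itself
        # and is called with the parameter as its only argument.
        handle = param.register_post_accumulate_grad_hook(callback)
        return handle, None

    # Older PyTorch: create a throwaway view to reach the AccumulateGrad
    # node of the parameter, then hook that node instead.
    param_tmp = param.expand_as(param)
    grad_acc = param_tmp.grad_fn.next_functions[0][0]
    # Node hooks receive (grad_inputs, grad_outputs); ignore both and
    # forward the parameter so both paths share one callback signature.
    handle = grad_acc.register_hook(lambda *unused: callback(param))
    return handle, grad_acc
```

With either path, `handle.remove()` is expected to unregister the hook, so
callers do not need to know which API was picked.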
---------
Signed-off-by: Max Kovalenko <mkovalenko@habana.ai>
Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: c8ef <c8ef@outlook.com>
Signed-off-by: Hongwei <hongweichen@microsoft.com>
Signed-off-by: Bruno Magalhaes <bruno.magalhaes@synthesia.io>
Signed-off-by: Hongwei Chen <hongweichen@microsoft.com>
Signed-off-by: Shaik Raza Sikander <srsikander@habana.ai>
Signed-off-by: shaomin <wukon1992@gmail.com>
Signed-off-by: Stas Bekman <stas@stason.org>
Signed-off-by: siqi <siqi@tecorigin.com>
Signed-off-by: Wei Wu <wuwei211x@gmail.com>
Signed-off-by: ShellyNR <shelly.nahir@live.biu.ac.il>
Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
Signed-off-by: Liang Cheng <astarxp777@gmail.com>
Signed-off-by: A-transformer <astarxp777@gmail.com>
Signed-off-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>
Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
Co-authored-by: inkcherry <mingzhi.liu@intel.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Connector Switch <c8ef@outlook.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bruno Magalhaes <bruno.magalhaes@synthesia.io>
Co-authored-by: loadams <loadams@users.noreply.github.com>
Co-authored-by: A-transformer <cl5743590921@gmail.com>
Co-authored-by: Raza Sikander <srsikander@habana.ai>
Co-authored-by: wukong1992 <wukong1992@users.noreply.github.com>
Co-authored-by: shaomin <wukon1992@gmail.com>
Co-authored-by: siqi654321 <siqi202311@163.com>
Co-authored-by: siqi <siqi@tecorigin.com>
Co-authored-by: Wei Wu <45323446+U-rara@users.noreply.github.com>
Co-authored-by: Shelly Nahir <73890534+ShellyNR@users.noreply.github.com>
Co-authored-by: snahir <snahir@habana.ai>
Co-authored-by: Yejing-Lai <yejing.lai@intel.com>
Co-authored-by: A-transformer <astarxp777@gmail.com>
Co-authored-by: Ma, Guokai <guokai.ma@gmail.com>
Co-authored-by: Glaceon-Hyy <ffheyy0017@gmail.com>