[CPU] FusedAdam and CPU training support (#3991)
* make fused adam buildable
* use cpu adam to implement fused adam
* enable ZeRO stage 1 and 2 for synchronized accelerators (a.k.a. CPU)
* remove unused parameters
* fix format error
* remove adam class
* fix format
* support ZeRO stage 3
* reuse simd.h
* fix format
* make memory_stat return a meaningful dict
* fix format
* add cpu_adam
* reuse cpu_adam
* header cleanup
* fix cpu_adam
* fix format, add missing file
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>