Fix checkpoint conversion when model layers share weights (#3825)
* fix
* remove debug line
* add test case
* use Adam
* fix formatting
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>