Pass down the new DS inference config to replace_transformer_layer. (#2539)
* Pass down the new DS inference config to replace_transformer_layer.
* Remove quantize_settings and rename ep_mp_group.
* Fix model_config passing, which fixes the GPT-J wrong-output issue.
* Fix a small bug in GPT-Neo.
Co-authored-by: Reza Yazdani and Michael Wyatt