Automatic tensor parallelism v2 (#2670)
* loop through pipe.model
* tp_parser first draft
* client_module must be type object
* Simplify layernorm tracking. Add unittest.
* cleanup
* Add more models to unittest
* cleanup inference pytest for merging
* Add unittest
* cleanup
* pre-commit
* unittest id and pytest marker
* try marian for unittest
* pre-commit
* Move tp code to separate file
* Add new auto tp file
* pre-commit and type
* Update deepspeed/module_inject/auto_tp.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update deepspeed/module_inject/auto_tp.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update tests/unit/inference/test_inference.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* remove unused fillmask function
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>