Avoid duplicating memory for tied weights in `dispatch_model`, and in forward with offloading (#2330)
* wip
* fix
* add test
* cleanup
* style
* style & tests pass
* fix offload, submodules
* cleanup
* Update tests/test_big_modeling.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* disk offloading: do not reload tied parameters into memory
* remove outdated comment
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>