Propagate the model loading from transformers serve to chat (#44758)
* Propagate the model loading from transformers serve to chat
* Docs and tests
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Logging update
* Adjust docs re Marc's comment
* Remove model name if too long for current console size
* Refactor dual model loading w/ locks
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>