Adding a child class of HF's rotary embedding to make HF generate work on multiple GPUs (#1334)
* ..
* adding comment
* improving test
* lint
* Update llmfoundry/models/mpt/modeling_mpt.py
Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
* addressing comments
---------
Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>