Adding support for Rotary Position Embeddings (#675)
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* removed the roformer implementation of rope
* ..
* fixed all the lint errors
* ..
* ..
* ../llmfoundry/models/mpt/modeling_mpt.py
* ..
* ..
* ..
* added unit test to test rotary embeddings
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* Update llmfoundry/models/mpt/modeling_mpt.py
Accepting the suggestion
Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>
* incorporated some suggestions from the pr
* ..
* ..
* ..
* ..
* ..
* ..
* ..
* added mark for gpu in the rotary embedding test
* ..
* ..
* ..
* removed the code for hf implementation of rope
* ..
* ..
* added tests
* ..
* ..
* ...
* ..
* ..
* ..
* ..
* ..
* fixed the tests after the merge
* minor change
* Fixed some tests failing due to a transformers library bug
* added check for flash_attention before importing their rotary embedding
* added check for flash_attention in tests before using dail rope
* fixed tests
* ..
* ..
* temporary fix
* ..
* ..
* fixed a test
* ..
* minor change
* minor changes
* added documentation
* added documentation
* temp commit
* made _set_config_defaults recursive
* minor changes
* reformatted tutorial table
* reformatted tutorial table
* reformatted tutorial table
* added documentation on how to install flash attention 2
* minor changes
* minor changes
* minor changes
* minor changes
* minor changes
* minor changes
* ..
* resolved some comments from the PR
* fixed tests
* modified is_flash_v2_installed
* minor changes
* Update TUTORIAL.md
Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
* Update TUTORIAL.md
Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
* Update TUTORIAL.md
Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
* Update TUTORIAL.md
Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
* resolved PR comments
---------
Co-authored-by: Shashank Rajput <ashank.rajput@databricks.com>
Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>
Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>