Shashank/seq id flash attn (#738)
* Update llmfoundry/models/layers/attention.py
Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>
* Update llmfoundry/models/mpt/modeling_mpt.py
Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>
* Update llmfoundry/models/mpt/modeling_mpt.py
Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>
---------
Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>