FEAT Mixing different LoRA adapters in same batch (#1558)
This PR revives the work by Sourab in #903. The core logic is
the same between the two PRs. This one should be more complete.
The main idea is to allow the user to mix different LoRA adapters in the
same batch. This is useful when the user wants to perform inference on a
batch whose samples require different LoRA adapters. Without this, each
batch would have to be restricted to the same LoRA adapter(s); see the
usage sketch below.
This PR should encompass:
- all task types
- all LoRA layer types
- bnb layers
Extensive tests were added, as well as documentation.
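For illustration, a minimal sketch of what mixed-adapter batched inference could look like. The base model, adapter paths, and adapter names are placeholders, and the per-sample `adapter_names` argument and the `"__base__"` sentinel are assumptions for this sketch, not a confirmed API surface:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "facebook/opt-125m"  # placeholder base model
base = AutoModelForCausalLM.from_pretrained(base_model_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
tokenizer.padding_side = "left"  # left-pad for decoder-only generation

# Load two LoRA adapters onto the same base model (paths are placeholders).
model = PeftModel.from_pretrained(base, "path/to/adapter_a", adapter_name="adapter_a")
model.load_adapter("path/to/adapter_b", adapter_name="adapter_b")

prompts = ["Hello", "Bonjour", "Hola"]
inputs = tokenizer(prompts, return_tensors="pt", padding=True)

# One adapter name per sample in the batch; "__base__" is assumed here to
# request the unmodified base model for that row.
adapter_names = ["adapter_a", "__base__", "adapter_b"]

with torch.inference_mode():
    output = model.generate(
        **inputs,
        adapter_names=adapter_names,
        max_new_tokens=20,
    )

print(tokenizer.batch_decode(output, skip_special_tokens=True))
```

The same idea applies to a plain forward pass: pass one adapter name per sample so that each row of the batch is routed through its own LoRA weights instead of requiring the whole batch to share one adapter.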
---------
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>