Enable multiple LoRa adapters #2010
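This PR adds multi-LoRA serving to text-generation-inference: the launcher loads several adapters at startup and a request picks one via an `adapter_id` parameter (per the "add new lora docs" commit below). As a hedged sketch of what a client payload might look like — the adapter name is a placeholder, not one shipped with the PR:

```python
import json

# Hypothetical request body for a TGI server launched with multiple LoRA
# adapters. "my-org/my-lora-adapter" is a placeholder adapter id.
payload = {
    "inputs": "What is deep learning?",
    "parameters": {
        "max_new_tokens": 20,
        # Selects one of the adapters loaded at startup; omit to use the
        # base model (the PR also supports base-model generation).
        "adapter_id": "my-org/my-lora-adapter",
    },
}

body = json.dumps(payload)
print(body)
```

The exact field names should be checked against the LoRA docs added in this PR.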
drbh force pushed from 091f2dce to d103264e 1 year ago
drbh marked this pull request as ready for review 1 year ago
drbh changed the title from "Lora internal" to "Enable multiple LoRa adapters" 1 year ago
feat: first draft load multiple lora
db3d8e65
feat: load weights within layer and refactor lora pass
0a6ea7fb
fix: refactor and reduce lora math
a046c303
feat: baseline impl single request multi lora support
c6616312
feat: prefer lorax implementation and port loading logic
8b50f4b7
fix: prefer adapter_data and refactors
d5f21d57
feat: prefer lorax's custom punica kernels and add mlp loras
8984ce6c
fix: adjust batch for bgmv
ad088d51
fix: adjust adapter_segments logic when in batch
c9273767
fix: refactor and move changes to v3 proto
73eb2ae2
fix: pass model_id for all flash causal lms
88bd5c2c
fix: pass model_id for all causal and seq2seq lms
dc0f7655
fix: add model_id to model test
9c45d349
feat: add lora support to mistral and refactors
de56a81c
feat: prefer model id in request
68399c1a
fix: include rust code for adapter id
81707bfb
feat: bump launcher and add new lora docs
43ec9dfe
feat: support base model generation and refactors
611225f0
fix: rename doc to retry ci build
a563a931
feat: support if vlm models
91f40722
fix: add adapter_data param and avoid missing layers
b1169273
fix: add adapter_data param to phi and neox
1deb3725
fix: update all models forwards to include adapter_data
101b95ad
fix: add model_id to IdeficsCausalLM
ce40ad26
Update lora.md
1be1ebc4
Update lora.md
d6cf63ca
drbh force pushed from 5a0ed2b3 to d6cf63ca 1 year ago
fix: add lora kernel to dockerfile, support running without kernels a…
aa88c4fd
fix: avoid dockerfile conflict
06c3254c
fix: merge 'main' into lora-internal to resolve conflicts
0e1c28ca
Merge branch 'main' into lora-internal
1104885f
Merge branch 'main' into lora-internal
224455f3
fix: refactors and adjust flash llama lora logic
4f1543d3
fix: skip llama test due to CI issue (temp)
ce70fce9
fix: skip llama test CI (temp) 2
c9e4526b
fix: revert skips and prefer updated ci token for tests
a07b6129
fix: refactors and helpful comments
3c9b28ea
fix: add noop in TensorParallelAdapterRowLinear too
c927cffb
fix: refactor and move shard_lora_weights logic
f94f2b3e
Merge branch 'main' into lora-internal
0d496baa
danieldk dismissed these changes on 2024-06-25
fix: exit early if no adapter_data
a2d821c4
drbh dismissed their stale review via a2d821c4 1 year ago
Merge branch 'main' into lora-internal
59575fe6
drbh merged 04e1af94 into main 1 year ago
drbh deleted the lora-internal branch 1 year ago