peft
support merge and unload when using PEFT + DeepSpeed ZeRO3 #1267
Closed

pacman100 wants to merge 2 commits into main from smangrul/deepspeed-merge-and-unlod
pacman100 · 1 year ago
No description provided.
pacman100 pushed: support merge and unload when using PEFT + DeepSpeed ZeRO3 (67ba57a6)
HuggingFaceDocBuilderDev · 1 year ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

BenjaminBossan · 1 year ago (edited)

Nice quality-of-life improvement. I have a suggestion: how about we package the functionality into a utility function? That way it's easier to plug in everywhere we need it (other adapters besides LoRA will need this too):

from contextlib import contextmanager

import torch.nn as nn
# assuming transformers' DeepSpeed integration helper for the ZeRO-3 check
from transformers.integrations import is_deepspeed_zero3_enabled


@contextmanager
def gather_params_ctx(module: nn.Module):
    """Gather the partitioned parameters of `module` when DeepSpeed ZeRO-3 is enabled."""
    if is_deepspeed_zero3_enabled():
        import deepspeed

        # under ZeRO-3 each rank only holds a shard of every parameter;
        # gather the full weights so merge/unmerge can operate on them
        params_to_gather = module.parameters()
        with deepspeed.zero.GatheredParameters(params_to_gather, modifier_rank=0):
            yield
        return

    # no DeepSpeed ZeRO-3: nothing to gather
    yield
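
One note on the design: with modifier_rank=0, DeepSpeed gathers the full parameters on all ranks, and on exiting the context it propagates the modifications made on rank 0 before re-partitioning, so a merge performed inside the context actually persists.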

Then we can do:

def merge_adapter(...):  # same idea for unmerge_adapter
    ...
    with gather_params_ctx(module):
        module.merge()
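
For illustration, a minimal sketch of how the helper could be wired into such a merge loop (hypothetical; the hasattr check stands in for PEFT's actual adapter-layer dispatch):

def merge_adapter(model: nn.Module) -> None:
    # walk all submodules and merge any adapter layer that exposes `merge`,
    # gathering its ZeRO-3 shards first so the full weights are available
    for module in model.modules():
        if hasattr(module, "merge"):
            with gather_params_ctx(module):
                module.merge()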
pacman100 pushed: address comments and fixes (9d54934d)
BenjaminBossan · 1 year ago

Note that there is some similarity with #1190; maybe the two context managers can (eventually) be merged into one.

jonathanasdf · 1 year ago

Sorry if this is unrelated. I tried patching this in so that I could initialize fine-tuning from a model after merging a LoRA adapter, but I'm hitting this error:

  File "/opt/venv/lib/python3.11/site-packages/transformers/trainer.py", line 1546, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/transformers/trainer.py", line 1681, in _inner_training_loop
    model, self.optimizer = self.accelerator.prepare(self.model, self.optimizer)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/accelerate/accelerator.py", line 1209, in prepare
    result = self._prepare_deepspeed(*args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/accelerate/accelerator.py", line 1582, in _prepare_deepspeed
    engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/__init__.py", line 171, in initialize
    engine = DeepSpeedEngine(args=args,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 304, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 1225, in _configure_optimizer
    self.optimizer = self._configure_zero_optimizer(basic_optimizer)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 1552, in _configure_zero_optimizer
    optimizer = DeepSpeedZeroOptimizer_Stage3(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/runtime/zero/stage3.py", line 146, in __init__
    self.dtype = self.optimizer.param_groups[0]['params'][0].dtype
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
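
One plausible reading of the traceback (an assumption; not confirmed in this thread): after merge-and-unload the model may be left with no parameters that require gradients, so the optimizer's first param group is empty and the param_groups[0]['params'][0] lookup in stage3.py fails. A quick sanity check before handing the model to the Trainer:

# hypothetical check; `model` is whatever gets passed to the Trainer
trainable = [p for p in model.parameters() if p.requires_grad]
assert len(trainable) > 0, "no trainable parameters left; ZeRO-3 optimizer init will fail"
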
BenjaminBossan · 1 year ago

@jonathanasdf Could you please open a new issue for this and add code to reproduce the error, if possible?

github-actions · 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions closed this 1 year ago
BenjaminBossan · 1 year ago

@pacman100 Would it make sense to revive this PR?
