peft
support merge and unload when using PEFT + DeepSpeed ZeRO3 #1267
Closed

pacman100 wants to merge 2 commits into main from smangrul/deepspeed-merge-and-unlod
pacman100 · 1 year ago
No description provided.
pacman100 pushed: support merge and unload when using PEFT + DeepSpeed ZeRO3 (67ba57a6)
HuggingFaceDocBuilderDev · 1 year ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

BenjaminBossan · 1 year ago (edited)

Nice quality-of-life improvement. I have a suggestion: how about we package the functionality into a utility function? That way it's easier to plug in everywhere we need it (other adapters besides LoRA will need this too):

from contextlib import contextmanager

import torch.nn as nn
# assuming transformers' DeepSpeed integration helper for the ZeRO-3 check
from transformers.integrations import is_deepspeed_zero3_enabled


@contextmanager
def gather_params_ctx(module: nn.Module):
    """Gather the partitioned parameters of `module` when DeepSpeed ZeRO-3 is enabled."""
    if is_deepspeed_zero3_enabled():
        import deepspeed

        # under ZeRO-3 each rank only holds a shard of every parameter;
        # gather the full weights so merge/unmerge can operate on them
        params_to_gather = module.parameters()
        with deepspeed.zero.GatheredParameters(params_to_gather, modifier_rank=0):
            yield
        return

    # no DeepSpeed ZeRO-3: nothing to gather
    yield
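
One note on the design: with modifier_rank=0, DeepSpeed gathers the full parameters on all ranks, and on exiting the context it propagates the modifications made on rank 0 before re-partitioning, so a merge performed inside the context actually persists.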

Then we can do:

def merge_adapter(...):  # same idea for unmerge_adapter
    ...
    with gather_params_ctx(module):
        module.merge()
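
For illustration, a minimal sketch of how the helper could be wired into such a merge loop (hypothetical; the hasattr check stands in for PEFT's actual adapter-layer dispatch):

def merge_adapter(model: nn.Module) -> None:
    # walk all submodules and merge any adapter layer that exposes `merge`,
    # gathering its ZeRO-3 shards first so the full weights are available
    for module in model.modules():
        if hasattr(module, "merge"):
            with gather_params_ctx(module):
                module.merge()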
pacman100 pushed: address comments and fixes (9d54934d)
BenjaminBossan · 1 year ago

Note that there is some similarity with #1190; maybe the two context managers can (eventually) be merged into one.

jonathanasdf · 1 year ago

Sorry if this is unrelated. I tried patching this in so that I could initialize fine-tuning from a model after merging a LoRA adapter, but I'm hitting this error:

  File "/opt/venv/lib/python3.11/site-packages/transformers/trainer.py", line 1546, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/transformers/trainer.py", line 1681, in _inner_training_loop
    model, self.optimizer = self.accelerator.prepare(self.model, self.optimizer)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/accelerate/accelerator.py", line 1209, in prepare
    result = self._prepare_deepspeed(*args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/accelerate/accelerator.py", line 1582, in _prepare_deepspeed
    engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/__init__.py", line 171, in initialize
    engine = DeepSpeedEngine(args=args,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 304, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 1225, in _configure_optimizer
    self.optimizer = self._configure_zero_optimizer(basic_optimizer)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 1552, in _configure_zero_optimizer
    optimizer = DeepSpeedZeroOptimizer_Stage3(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/deepspeed/runtime/zero/stage3.py", line 146, in __init__
    self.dtype = self.optimizer.param_groups[0]['params'][0].dtype
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
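
One plausible reading of the traceback (an assumption; not confirmed in this thread): after merge-and-unload the model may be left with no parameters that require gradients, so the optimizer's first param group is empty and the param_groups[0]['params'][0] lookup in stage3.py fails. A quick sanity check before handing the model to the Trainer:

# hypothetical check; `model` is whatever gets passed to the Trainer
trainable = [p for p in model.parameters() if p.requires_grad]
assert len(trainable) > 0, "no trainable parameters left; ZeRO-3 optimizer init will fail"
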
BenjaminBossan · 1 year ago

@jonathanasdf Could you please open a new issue for this and add code to reproduce the error, if possible?

github-actions · 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions closed this 1 year ago
BenjaminBossan · 1 year ago

@pacman100 Would it make sense to revive this PR?
