Thanks for making `modules_to_save` work with DeepSpeed ZeRO-3. LGTM.
What does this PR do?
The `deepcopy` performed in the `ModulesToSaveWrapper` class doesn't work as expected under DeepSpeed ZeRO-3: it creates a new module with 0 parameters, which results in an error when training. With this PR, the parameters of the modules specified via the `modules_to_save` config option are gathered across processes using `deepspeed.zero.GatheredParameters` before the copy is made. Without this PR, training fails with the above error; with this PR, the fine-tuning happens successfully.
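For reference, here is a minimal sketch of the gather-then-copy pattern, assuming a standard `torch.nn.Module` with DeepSpeed installed (`safe_deepcopy` is a hypothetical helper name, not the exact PR diff): under ZeRO-3 each rank holds only a shard of every parameter, so the local tensors are empty and a plain `copy.deepcopy` yields a module with 0 parameters; gathering the full parameters first makes the copy complete.

```python
import copy

import deepspeed


def safe_deepcopy(module):
    """Deep-copy `module` after gathering its ZeRO-3 partitioned parameters.

    Hypothetical helper illustrating the fix pattern; not PEFT's actual code.
    """
    params = list(module.parameters())
    # Inside this context the partitioned parameters are materialized in full
    # on every rank; modifier_rank=None means no rank modifies them.
    with deepspeed.zero.GatheredParameters(params, modifier_rank=None):
        return copy.deepcopy(module)
```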
launch command:
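The exact command isn't reproduced here; a hypothetical ZeRO-3 launch (with `train.py` and `ds_zero3_config.json` as placeholder names) might look like:

```sh
deepspeed --num_gpus=2 train.py --deepspeed ds_zero3_config.json
```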
Fixes: huggingface/transformers#24445 (comment)