Fix EMA for multi-GPU training in the unconditional example (#1930)
* improve EMA
* style
* one EMA model
* quality
* fix tests
* fix test
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* reorganise the unconditional script
* backwards compatibility
* default to init values for some args
* fix ort script
* issubclass => isinstance
* update state_dict
* docstring
* doc
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* use .to if device is passed
* deprecate device
* make flake8 happy
* fix typo
Co-authored-by: patil-suraj <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>