Allow not using past_key_values (#241)
* Make past_key_values optional as inputs
* Fix decoder with past saving
* Add tests
* Fix test
* Fix docstring
* Adapt documentation
* Fix comment
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
* Fix comment
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
* Fix comment
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
* Rename variable use_past_key_values to use_cache for consistency with transformers
* Add copyright
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>