FIX: Avoid caching in X-LoRA generate (#2384)
X-LoRA tests started failing after this transformers PR:
https://github.com/huggingface/transformers/pull/35724
The solution appears to be to disable caching completely when calling
generate on the X-LoRA model. This also makes some previously xfail-ing
tests pass.
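As a minimal sketch of the idea (not the actual PEFT change), caching can be
forced off by overriding the `use_cache` keyword argument before delegating to
`generate`. `DummyModel` and `generate_without_cache` are hypothetical names
used only for illustration; `use_cache` is assumed to be the keyword that
controls caching, as it is in transformers' `generate`.

```python
from typing import Any


class DummyModel:
    """Hypothetical stand-in for a model exposing a generate(**kwargs) method."""

    def generate(self, **kwargs: Any) -> dict:
        # Echo the keyword arguments back so the caller can inspect them.
        return kwargs


def generate_without_cache(model: Any, **kwargs: Any) -> Any:
    """Call model.generate with caching forced off, overriding any caller value."""
    # Assumption: the wrapped model reads the `use_cache` keyword argument.
    kwargs["use_cache"] = False
    return model.generate(**kwargs)


# Even if the caller asks for caching, the wrapper disables it.
result = generate_without_cache(DummyModel(), max_new_tokens=8, use_cache=True)
print(result)
```

Overriding the argument instead of only setting a default means a caller
cannot re-enable caching by accident.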
I tested this locally with transformers checked out before and after the
mentioned PR, and the tests pass in both cases. I also tested changing
the base model from "facebook/opt-125m" to
"trl-internal-testing/tiny-random-LlamaForCausalLM", and the tests passed
with both.
Also, mark the X-LoRA save_load_function test as flaky.
It was previously marked as xfail, but it is in fact just flaky.
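To illustrate the difference: an xfail test is expected to fail, whereas a
flaky test is expected to pass but may need more than one attempt. The sketch
below is a plain-Python stand-in for a flaky marker, not the decorator the
test suite actually uses; `flaky`, `max_runs`, and `check_sometimes_fails` are
hypothetical names.

```python
import functools
from typing import Callable


def flaky(max_runs: int = 3) -> Callable:
    """Hypothetical decorator: rerun a check up to max_runs times, pass on first success."""
    if max_runs < 1:
        raise ValueError("max_runs must be at least 1")

    def decorator(test_fn: Callable) -> Callable:
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_runs):
                try:
                    return test_fn(*args, **kwargs)
                except AssertionError as exc:
                    # Remember the failure and try again.
                    last_error = exc
            # Every attempt failed, so report the last failure.
            raise last_error

        return wrapper

    return decorator


attempts = {"count": 0}


@flaky(max_runs=3)
def check_sometimes_fails() -> None:
    # Fails on the first two attempts, passes on the third.
    attempts["count"] += 1
    assert attempts["count"] >= 3


check_sometimes_fails()
```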