Fix BLEURT evaluation errors (#316)
These changes address the issues described in: #315
I made the code changes such that it built on the BERTScore changes (#311) that haven't been merged yet, so we see those changes here. Please let me know if there is preference on removing those from this PR. Thank you!
---------
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Nathan Habib <30601243+NathanHB@users.noreply.github.com>