Upgrade AllenNLP + hotfix (#287)
* Fix AllenNLP inference
* Fix API GA workflow
* Fix dataset name of batch test
* Fix Docker GA workflow
* Ensure AllenNLP test actually runs instead of being skipped
* Change env variable
* Fix style, upgrade AllenNLP, revert some changes
* Fix dependencies of AllenNLP
* Test longer retry
* Format
* Test again