Add scaled_dot_product_attention support for decoder models (#853)
* add gpt2
* add files
* refactor tests
* support gpt2, better tests
* fix
* more models
* fix GPT-Neo
* small fixes
* add OPT support
* fix tests
* add comment
* fix mock
* fix uninstall
* size
* last fix