Support input dimension swap in Attention op (#5774)
* checkin cpu
* add test
* cuda
* update comments
* review comments
* update
* modify var name
* remove unnecessary error msg
* fix comments
Co-authored-by: wangye <wangye@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>