Enable Add + Softmax fusion for the ROCm platform (#6259)
* add bias softmax; tests appear to pass
* check fusion occurs for ROCm as well
* check for ROCm provider compatibility as well
* build for cpu scenario as well
* try again; broader scope
* proper scope on kGpuExecutionProvider
* been editing wrong file
* remove commented #include lines
* try again due to macOS CI error
* try again
* test fusion for both CUDA and ROCm to avoid macOS CI error