Refactoring of attention cuda kernel: move prepare qkv and concat_past_to_present (#17559)
To avoid a huge cu file and make code more readable:
- Move PrepareQKV to separate cu file (attention_prepare_qkv.cu)
- Move ConcatPastToPresent to attention_concat.cu
- Add default value for AttentionData
- Add a data structure QkvData to track Q, K and V pointers and track
QKV format.