Fix num_heads in _upad_input (#26490)
* Fix num_heads in _upad_input
The variable holding num_key_value_heads was incorrectly named num_heads, which caused query_layer to be reshaped with the wrong attention head count. (Using self.num_heads instead of num_heads would have been enough to fix the bug, but I also renamed num_heads to num_key_value_heads for clarity.)
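To illustrate why the misnamed variable matters, here is a minimal sketch in plain Python that only does the shape arithmetic. It is not the code from the repository: the sizes, the `can_reshape` helper, and the `heads_from_key` name are hypothetical and chosen for illustration, assuming a grouped-query setup where the key/value head count is smaller than the query head count.

```python
from math import prod


def can_reshape(shape, new_shape):
    """Return True if a tensor of `shape` has the same element count as `new_shape`."""
    return prod(shape) == prod(new_shape)


# Hypothetical grouped-query attention sizes, for illustration only.
batch_size, seq_len, head_dim = 2, 5, 4
num_heads = 8             # query heads (self.num_heads in the model)
num_key_value_heads = 2   # key/value heads

query_shape = (batch_size, seq_len, num_heads, head_dim)
key_shape = (batch_size, seq_len, num_key_value_heads, head_dim)

# Unpacking the key shape yields the key/value head count in the third slot,
# which is the value that had been bound to the name `num_heads`.
_, kv_seq_len, heads_from_key, _ = key_shape

# Before the fix: the query is reshaped with the key/value head count.
buggy_target = (batch_size * kv_seq_len, heads_from_key, head_dim)
# After the fix: the query is reshaped with the query head count.
fixed_target = (batch_size * kv_seq_len, num_heads, head_dim)

buggy_ok = can_reshape(query_shape, buggy_target)
fixed_ok = can_reshape(query_shape, fixed_target)
print(buggy_ok, fixed_ok)
```

When the two head counts are equal (plain multi-head attention), both targets coincide, which is presumably why the misnaming went unnoticed until a model with fewer key/value heads was used.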
* Fix copies using make fix-copies and run make fixup
---------
Co-authored-by: fseiler <f.seiler@jerocom.de>