[GHF] Remove CC line from commit message (#88252)
This line is added by autoCCBot but is not meaningful as part of the commit
message, so strip it before constructing the merge commit.
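The stripping is done by substituting a `cc`-line regex with the empty string. A minimal sketch of the idea (the pattern name `RE_CC_LINE` and its exact contents here are illustrative assumptions, not necessarily the `RE_PR_CC_LINE` shipped in `trymerge.py`):

```python
import re

# Hypothetical approximation of a cc-line regex: matches a line that
# starts with "cc" (optionally "cc:") followed by one or more @-mentions,
# and consumes the trailing newline so no blank line is left behind.
RE_CC_LINE = re.compile(r"^cc:?\s+@.*$\n?", re.MULTILINE | re.IGNORECASE)

body = "Fixes #ISSUE_NUMBER\r\n\ncc @user1 @user2\n"
cleaned = re.sub(RE_CC_LINE, "", body)
```

Because the pattern uses `re.MULTILINE`, `^`/`$` anchor at line boundaries, so only the cc line itself is removed while the rest of the PR body is preserved verbatim.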
Test Plan:
```
>>> from trymerge import GitHubPR, RE_PR_CC_LINE
>>> import re
>>> pr=GitHubPR("pytorch", "pytorch", 87809)
>>> re.sub(RE_PR_CC_LINE, "", pr.get_body())
'Fixes #ISSUE_NUMBER\r\n\n\n'
>>> pr=GitHubPR("pytorch", "pytorch", 87913)
>>> re.sub(RE_PR_CC_LINE, "", pr.get_body())
'Parallel compilation warms the Threadpool when we call `torch._dynamo.optimize()`. In current benchmarks, we were setting up the TRITON_CACHE_DIR much later. Because of this parallel compilation artifacts were not used and compilation latency improvements were not visible in dashboard. This PR just prepones the setup of TRITON_CACHE_DIR.\n\n'
>>> pr=GitHubPR("pytorch", "pytorch", 85692)
>>> re.sub(RE_PR_CC_LINE, "", pr.get_body())
'This PR sets CUDA_MODULE_LOADING if it\'s not set by the user. By default, it sets it to "LAZY".\r\n\r\nIt was tested using the following commands:\r\n```\r\npython -c "import torch; tensor=torch.randn(20, 16, 50, 100).cuda(); free, total = torch.cuda.cudart().cudaMemGetInfo(0); print(total-free)"\r\n```\r\nwhich shows a memory usage of: 287,047,680 bytes\r\n\r\nvs\r\n\r\n```\r\nCUDA_MODULE_LOADING="DEFAULT" python -c "import torch; tensor=torch.randn(20, 16, 50, 100).cuda(); free, total = torch.cuda.cudart().cudaMemGetInfo(0); print(total-free)"\r\n```\r\nwhich shows 666,632,192 bytes. \r\n\r\nC++ implementation is needed for the libtorch users (otherwise it could have been a pure python functionality).\r\n\r\n'
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88252
Approved by: https://github.com/xuzhao9, https://github.com/izaitsevfb