[cuDNN V8 API] (reopen 2) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limitedCudnnv8 benchmark limit (#78299)
Reopen of #77002 to address comments by @malfet
CC @ngimel @ptrblck
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78299
Approved by: https://github.com/ngimel