Add heuristic for when `evict_first` should be set (and some other minor things) (#108841)
An example where the `evict_first` heuristic helps:
```
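import torch
from torch._inductor.utils import do_bench

@torch.compile
def f(a, b):
    # Elementwise product followed by a reduction over the last dimension.
    return (a * b).sum(dim=-1)

N = 512
# Permuted (non-contiguous) inputs, each with a different memory layout.
inps = (torch.randn(N, N, N).permute(2, 1, 0), torch.randn(N, N, N).permute(1, 2, 0))
print(do_bench(lambda: f(*inps)))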
```
The generated code looks like this: http://ix.io/4HFs
```
Original: 3.8 ms
This PR: 3.54 ms
Always `evict_first`: 5.4 ms
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108841
Approved by: https://github.com/lezcano, https://github.com/jansel