[LifetimeSafety] Track dereference operators for GSL pointers in STL (#176643)
Improve lifetime analysis for STL iterators in range-based `for` loops
by tracking the dereference operator of GSL pointers.
Added support for dereference operators (`*`) on GSL pointers in STL so that
they track the pointer value instead of the `this` arg reference.
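
As a rough illustration of the kind of code this is meant to cover (a sketch, not taken from the tests in this PR): in a range-based `for` loop the loop variable is initialized from `*it` on a hidden iterator, so a pointer to it should be tied to the container's storage rather than to the iterator object. The helper `longest_copy` and the commented-out dangling variant are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns a copy of the longest element, so nothing
// that points into 'v' escapes the function.
std::string longest_copy(const std::vector<std::string>& v) {
    const std::string* best = nullptr;
    // 's' is bound to '*it' of the implicit iterator, so '&s' points into
    // storage owned by 'v', not into the iterator object.
    for (const std::string& s : v) {
        if (best == nullptr || s.size() > best->size()) best = &s;
    }
    return best ? *best : std::string();
}

// The dangling variant would look like this (left commented out because
// using the result is undefined behaviour):
//
//   const std::string* p = nullptr;
//   {
//       std::vector<std::string> v{"a", "bcd"};
//       for (const std::string& s : v) p = &s;  // 'p' points into 'v'
//   }
//   // '*p' here would read destroyed storage.

void example() {
    std::vector<std::string> v{"a", "bcd", "ef"};
    assert(longest_copy(v) == "bcd");
}
```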
Tests:
- Removed a test that started to fail and did not look correctly
annotated (?).
- Added new test cases for range-based for loop variables and iterator
arrow operators
## **Thoughts on overall direction:**
I feel we are twisting clang too much here by hardcoding heuristics that
cannot otherwise be expressed through the available annotations. This
started with the intention of implicitly annotating the STL to
benefit every implementation out there. But it has moved in a
direction where we have heuristics that can no longer be expressed by
annotations (like transparent accessors and inner pointers). This is
also a fallout of making `[[clang::lifetimebound]]` and `[[gsl::*]]`
annotations work together.
Some takeaways for me here are:
1. It is great to have inferred annotations for the STL, but they should be
expressible through explicit annotations, especially when STL
implementers are willing to add these (e.g.
https://github.com/llvm/llvm-project/pull/112751).
2. We need a simple way to express transparent accessor functions of GSL
pointers.
### Transparent functions:
```cpp
int* transparent(int* in) { return in; }
```
There is no "outlives" constraint imposed by this function. Instead, it
introduces only an aliasing effect. The return value is an alias to
`in`. This aliasing effect can be expressed through
`clang::lifetimebound`.
```cpp
int* transparent(int* in [[clang::lifetimebound]]) { return in; }
```
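
A minimal usage sketch, assuming a Clang build: the annotation lets call-site analysis treat the result as an alias of the argument. The `LIFETIMEBOUND` macro is a hypothetical portability shim so the snippet also builds on compilers that do not know the attribute.

```cpp
#include <cassert>

#if defined(__clang__)
#define LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define LIFETIMEBOUND  // hypothetical shim: expands to nothing elsewhere
#endif

int* transparent(int* in LIFETIMEBOUND) { return in; }

void example() {
    int x = 1;
    int* p = transparent(&x);  // 'p' aliases '&x'; valid while 'x' is alive
    *p = 2;
    assert(p == &x && x == 2);
}

// A caller that lets the result escape the pointee's scope is the case the
// annotation is meant to expose:
//
//   int* p = nullptr;
//   { int y = 0; p = transparent(&y); }
//   // '*p' would dangle here.
```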
The same is true for `view` types.
```cpp
// return view aliases 'in' which points to some char* owned by some 'std::string'.
// This helps in upholding the contract "that std::string should outlive the value returned here".
std::string_view transparent(std::string_view in [[clang::lifetimebound]]) { return in; }
```
This gets complicated when we have references to `view` types.
```cpp
// Not possible to annotate this to express the same effect/contract.
std::string_view transparent2(const std::string_view& in) { return in; }
```
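
To make the difficulty concrete, here is a small sketch (the caller `example` and the variable names are made up for illustration). As I understand it, putting `[[clang::lifetimebound]]` on the reference parameter would tie the result to the `std::string_view` object itself, which is stricter than the contract we want.

```cpp
#include <cassert>
#include <string>
#include <string_view>

std::string_view transparent2(const std::string_view& in) { return in; }

void example() {
    std::string owner = "hello";
    // The argument is a temporary std::string_view that dies at the end of
    // the full expression, but the result still points into 'owner', so this
    // call is valid. Binding the result to 'in' itself would flag it.
    std::string_view r = transparent2(std::string_view(owner));
    assert(r == "hello");
    assert(r.data() == owner.data());
}
```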
This gets much harder when we want to express this for accessor methods
of view types, as the implicit `this` argument is always a reference.
```cpp
class [[gsl::Pointer]] MySpan {
public:
  MySpan(MyVector<int>& v [[clang::lifetimebound]]); // 'MySpan' should not outlive 'v'. Ok.
  // 'begin' aliases MySpan which aliases 'v'. No way to express this.
  const MyVector<int>* begin() const { return begin_; }
  const MyVector<int>* end() const { return end_; }
private:
  // Trailing underscores keep the data members from clashing with the accessors.
  const MyVector<int>* begin_;
  const MyVector<int>* end_;
};
// Annotating free functions is again possible, provided the view type is accepted by value.
const MyVector<int>* getBegin(MySpan my_span [[clang::lifetimebound]]) { // returns the pointee of the pointer 'my_span'.
  return my_span.begin();
}
```
The problematic cases are quite similar though:
```cpp
std::string_view transparent2(const std::string_view& in) { return in; }
class [[gsl::Pointer]] MySpan {
  const MyVector<int>* begin() const { return begin_; }
};
```
These essentially talk about the **inner** pointer. For `transparent2`, we
want to express that it aliases the inner pointer of `const
std::string_view&`. For `MySpan::begin()`, we want to express that
it aliases the inner pointer of `const MySpan&`, which is the type
of the implicit `this` parameter.
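
A self-contained sketch of the accessor case, using a simplified and hypothetical `IntSpan` over `std::vector<int>` in place of `MySpan`/`MyVector` (the macros are portability shims, also hypothetical). It shows why the result of `begin()` should be tied to the vector rather than to the span object.

```cpp
#include <cassert>
#include <vector>

#if defined(__clang__)
#define GSL_POINTER [[gsl::Pointer]]
#define LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define GSL_POINTER    // hypothetical shims: expand to nothing elsewhere
#define LIFETIMEBOUND
#endif

// Simplified, hypothetical variant of 'MySpan': a view over a vector of int.
class GSL_POINTER IntSpan {
public:
    explicit IntSpan(const std::vector<int>& v LIFETIMEBOUND)
        : begin_(v.data()), end_(v.data() + v.size()) {}
    // Returns the inner pointer: it points into the vector, not into '*this'.
    const int* begin() const { return begin_; }
    const int* end() const { return end_; }
private:
    const int* begin_;
    const int* end_;
};

void example() {
    std::vector<int> v{1, 2, 3};
    const int* first = nullptr;
    {
        IntSpan span(v);
        first = span.begin();
    }  // 'span' is gone, but 'first' aliases storage owned by 'v'
    assert(first == v.data() && *first == 1);
}
```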
### Options:
- `[[clang::lifetimebound(2)]]`: Refer to the inner pointer after
peeling the outer pointer/reference. More generally,
`[[clang::lifetimebound(x)]]`, where `x` is a positive integer giving the
number of pointer layers to peel off.
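
A sketch of how this option might read on `transparent2`. The argument form of the attribute is only a proposal here and, as far as I know, is not accepted by Clang today, so the hypothetical `LIFETIMEBOUND_INNER` macro stands in for it and expands to nothing. The placement on the parameter is my assumption, by analogy with the existing attribute.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical placeholder for the proposed '[[clang::lifetimebound(2)]]'.
// It expands to nothing because the spelling is only a proposal.
#define LIFETIMEBOUND_INNER

// Intended reading (my interpretation of the proposal): look through the
// reference and bind the result to whatever the std::string_view points to.
std::string_view transparent2(const std::string_view& in LIFETIMEBOUND_INNER) {
    return in;
}

void example() {
    std::string owner = "abc";
    // Under that reading the result would depend on 'owner' only, so passing
    // a temporary std::string_view would not be reported.
    std::string_view r = transparent2(std::string_view(owner));
    assert(r.data() == owner.data());
}
```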
cc: @Xazax-hun