[NV TRT RTX EP] Reconfigure memory arena to grow with power of 2 (#25800)
This change reconfigures the arena so that it does NOT allocate tensors
with an exactly matching size. With exact-size allocation, a tensor will
always trigger a new allocation in the arena and never reuse memory,
because the size of a freed block has to match the request exactly.
This became a big problem with ORT GenAI: the arena grew constantly when
prompting with different prompt lengths, and no arena shrinkage was
triggered to return the memory of older tensors. @skottmckay I am happy
to be educated on a better usage of the allocators.
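
To illustrate the growth pattern described above, here is a minimal toy sketch in Python. It is not ONNX Runtime's arena implementation. `ToyArena`, `simulate`, and the request sizes are hypothetical and only model the stated assumption that a freed block is reused only when its size matches the request exactly.

```python
def next_power_of_two(n):
    """Return the smallest power of two >= n (n must be positive)."""
    return 1 << (n - 1).bit_length()


class ToyArena:
    """Toy pool (assumption): a freed block is reused only on an exact size match."""

    def __init__(self, round_up):
        self.round_up = round_up
        self.free_blocks = {}    # block size -> number of free blocks of that size
        self.reserved_bytes = 0  # total bytes ever requested from the device

    def alloc(self, nbytes):
        # Optionally round the request up so that similar sizes share one bucket.
        size = next_power_of_two(nbytes) if self.round_up else nbytes
        if self.free_blocks.get(size, 0) > 0:
            self.free_blocks[size] -= 1   # reuse a previously freed block
        else:
            self.reserved_bytes += size   # no match: the arena has to grow
        return size

    def free(self, size):
        self.free_blocks[size] = self.free_blocks.get(size, 0) + 1


def simulate(round_up, request_sizes):
    """Allocate and immediately free each request; return the bytes reserved."""
    arena = ToyArena(round_up)
    for nbytes in request_sizes:
        block = arena.alloc(nbytes)
        arena.free(block)
    return arena.reserved_bytes


# Hypothetical tensor sizes produced by prompts of different lengths.
prompt_sizes = [1000, 1010, 1020, 900, 1005]
exact = simulate(False, prompt_sizes)  # every distinct size grows the arena
pow2 = simulate(True, prompt_sizes)    # all requests share one 1024-byte block
print(exact, pow2)
```

In this toy model the exact-size variant reserves a new block for every distinct request size, while the power-of-2 variant keeps reusing a single block, at the cost of some padding per allocation.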
Issues with this:
Since the arena is no longer used for workspace allocations (they go
through reserve instead), it will likely not be possible in the future
to allocate on a stream and free the memory immediately after an enqueue
call. That could have enabled workspace sharing in a multi-model
pipeline very nicely.
@chilo-ms can you help merge this?