onnxruntime
63c1d1ac - [NV TRT RTX EP] Reconfigure memory arena to grow with power of 2 (#25800)

Commit
168 days ago
This reconfiguration is done so that tensors are NOT allocated with an exactly matching size. With the exact-size strategy, memory can only be reused when the requested size matches a freed block exactly, so in practice a tensor almost always triggers a new allocation in the arena instead of reusing memory. This became a big problem with ORT GenAI: the arena grew constantly when prompting with different prompt lengths, and no arena shrinkage was triggered to return the memory of older tensors.

@skottmckay I am happy to be educated on a better usage of the allocators.

Issues with this: since the arena is no longer used for workspace allocations (reserve is used instead), it will likely not be possible in the future to allocate on a stream and free the memory immediately after an enqueue call. That could have enabled workspace sharing in a multi-model pipeline very nicely.

@chilo-ms can you help merge this?
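
The growth problem described above can be illustrated with a toy model. The sketch below is not ONNX Runtime's arena implementation; `ToyArena`, `simulate`, and the 1024 bytes-per-token figure are hypothetical names and values chosen for illustration. It assumes a freed block can only be reused by a request of exactly the same block size, and compares exact-size allocation against rounding each request up to the next power of two.

```python
from collections import defaultdict


def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    return 1 << (n - 1).bit_length()


class ToyArena:
    """Hypothetical arena: a freed block is reused only on an exact size match."""

    def __init__(self, round_up: bool):
        self.round_up = round_up
        self.free_blocks = defaultdict(int)  # block size -> number of free blocks
        self.reserved_bytes = 0              # total memory the arena has grown to

    def alloc(self, nbytes: int) -> int:
        # Optionally bucket the request into a power-of-two block size.
        size = next_power_of_two(nbytes) if self.round_up else nbytes
        if self.free_blocks[size] > 0:
            self.free_blocks[size] -= 1      # reuse a previously freed block
        else:
            self.reserved_bytes += size      # no match: the arena has to grow
        return size

    def free(self, size: int) -> None:
        self.free_blocks[size] += 1


def simulate(prompt_lengths, bytes_per_token, round_up):
    """Allocate and free one tensor per prompt; return the final arena size."""
    arena = ToyArena(round_up)
    for length in prompt_lengths:
        block = arena.alloc(length * bytes_per_token)
        arena.free(block)
    return arena.reserved_bytes


if __name__ == "__main__":
    lengths = [100, 120, 90, 127, 100]  # varying prompt lengths (made-up values)
    print("exact size:  ", simulate(lengths, 1024, round_up=False))
    print("power of two:", simulate(lengths, 1024, round_up=True))
```

In this model every new prompt length adds a block under exact-size allocation, while power-of-two rounding maps all five requests to one 131072-byte block that is reused each time. A real arena may also split and coalesce blocks, which this sketch deliberately ignores.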