text-generation-inference
856709d5 - [Backend] Bump TRTLLM to v.0.17.0 (#2991)

Committed 314 days ago
[Backend] Bump TRTLLM to v.0.17.0 (#2991)

* backend(trtllm): bump TRTLLM to v.0.17.0
* backend(trtllm): forgot to bump the Dockerfile
* backend(trtllm): use ARG instead of ENV
* backend(trtllm): use correct library reference decoder_attention_src
* backend(trtllm): link against decoder_attention_{0|1}
* backend(trtllm): build against gcc-14 with CUDA 12.8
* backend(trtllm): enable the return value optimization warning flag as an error, if available
* backend(trtllm): make sure we escalate all warnings to errors on the backend impl in debug mode
* backend(trtllm): link against CUDA 12.8
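The debug-mode warnings-as-errors and the split decoder_attention linking mentioned in the commit bullets could be wired up in CMake roughly as follows. This is a hedged sketch, not the repository's actual build file: the target name `tgi_trtllm_backend` is assumed, and only the library names `decoder_attention_0`/`decoder_attention_1` come from the commit message.

```cmake
# Sketch only -- target and variable names are assumptions, not from the repo.

# Escalate all warnings to errors, but only for Debug builds,
# matching the "warnings as errors in debug mode" bullet above.
if(CMAKE_BUILD_TYPE STREQUAL "Debug")
    add_compile_options(-Wall -Wextra -Werror)
endif()

# TRTLLM v0.17 ships decoder attention as two split libraries;
# link against both, per the decoder_attention_{0|1} bullet.
target_link_libraries(tgi_trtllm_backend PRIVATE
    decoder_attention_0
    decoder_attention_1
)
```

A guard like `CMAKE_BUILD_TYPE STREQUAL "Debug"` keeps release builds unaffected, so strict diagnostics only gate development builds rather than user-facing ones.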