pytorch
d553478c - [v1.8] Make TensorPipe work around bug in old versions of libibverbs (#52615)

Commit
4 years ago
[v1.8] Make TensorPipe work around bug in old versions of libibverbs (#52615)

The bug affects PyTorch users who meet two conditions:
- they have an old version of libibverbs installed (the userspace library), namely older than v25, which dates from Jul 29, 2019;
- but they do _not_ have an InfiniBand kernel module loaded.

In those cases they will experience a crash (uncaught exception) happening when initializing RPC, mentioning an "unknown error -38".

There is a workaround, which is for those users to activate a killswitch (which is private and undocumented) to disable the `ibv` backend of TensorPipe.
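As a rough diagnostic for the second condition above, the sketch below checks whether an InfiniBand kernel module appears in `/proc/modules`. It is only an illustration, not part of this commit: the helper names are hypothetical, and the choice of `ib_uverbs` as the module to look for is an assumption (the commit message does not name a specific module). Modules compiled into the kernel do not show up in `/proc/modules`, so a negative result is only a hint. The sketch also looks up errno 38, which on Linux is `ENOSYS` ("Function not implemented"); that this is the source of the "-38" in the error message is likewise an assumption.

```python
import errno
import os


def ib_module_loaded(proc_modules_text: str, module: str = "ib_uverbs") -> bool:
    """Return True if `module` is listed in /proc/modules-style text.

    Each line of /proc/modules starts with the module name, followed by
    whitespace-separated fields (size, use count, and so on).
    `ib_uverbs` is an assumed default, not something the commit specifies.
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == module:
            return True
    return False


def read_proc_modules(path: str = "/proc/modules") -> str:
    """Read the loaded-module list, returning "" if it is unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    loaded = ib_module_loaded(read_proc_modules())
    print("InfiniBand module (ib_uverbs) loaded:", loaded)
    # Symbolic name and message for errno 38 on this platform, if known.
    print("errno 38:", errno.errorcode.get(38, "unknown"), "-", os.strerror(38))
```

If the check reports no module while an old libibverbs is installed, the machine may match both conditions described in the commit message.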
Author
lw lw