DeepSpeed
f887b982 - fix: handle non-existent path in is_nfs_path for Triton autotune cache (#7921)

Commit
30 days ago
fix: handle non-existent path in is_nfs_path for Triton autotune cache (#7921)

### Summary

- `is_nfs_path()` in `matmul_ext.py` passes the cache directory path to `df -T` before the directory is created, causing `df: /root/.triton/autotune: No such file or directory` errors on stderr
- Fix by walking up to the nearest existing ancestor directory before invoking `df`, which correctly resolves the filesystem type without requiring the target path to exist
- Also suppress stderr via `subprocess.DEVNULL` and catch `FileNotFoundError` for environments where `df` is unavailable (e.g., minimal containers)

### Root Cause

In `AutotuneCacheManager.__init__`, `TritonCacheDir.warn_if_nfs(self.cache_dir)` is called before `os.makedirs(self.cache_dir, exist_ok=True)`. The `is_nfs_path()` function then runs `df -T` on a path that does not yet exist, which causes `df` to print an error to stderr. While the `CalledProcessError` exception was caught, the stderr output still leaked to the user's terminal.

### Changes

- `deepspeed/ops/transformer/inference/triton/matmul_ext.py`: Walk up to nearest existing ancestor before calling `df -T`; suppress stderr; catch `FileNotFoundError`

### Testing

- Python syntax validation: PASS
- yapf formatting check: PASS (no diff)
- flake8: PASS (no warnings)

Fixes #7642

Signed-off-by: Krishna Chaitanya Balusu <krishnabkc15@gmail.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
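
The approach described in this commit message can be illustrated with a minimal sketch. This is not the code from `matmul_ext.py`: the names `nearest_existing_ancestor` and `is_nfs_path_sketch` are hypothetical, and parsing the second line of `df -T` output assumes the GNU coreutils column layout (`Filesystem`, `Type`, ...). It only shows the three ideas named above: walk up to an existing directory, silence stderr, and tolerate a missing `df` binary.

```python
import os
import subprocess


def nearest_existing_ancestor(path):
    """Return `path` itself if it exists, else its closest existing ancestor.

    Hypothetical helper, not part of DeepSpeed.
    """
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root; stop walking
            break
        path = parent
    return path


def is_nfs_path_sketch(path):
    """Best-effort guess at whether `path` would live on an NFS mount.

    Assumes GNU `df -T` output, where the second column of the data row
    is the filesystem type. Returns False whenever the check cannot run.
    """
    existing = nearest_existing_ancestor(path)
    try:
        output = subprocess.check_output(
            ["df", "-T", existing],
            stderr=subprocess.DEVNULL,  # keep df's error messages off the terminal
            encoding="utf-8",
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        # df exited non-zero, or df is not installed (e.g. minimal containers)
        return False
    lines = output.strip().split("\n")
    if len(lines) < 2:
        return False
    fields = lines[1].split()
    return len(fields) > 1 and "nfs" in fields[1].lower()
```

Because the filesystem type is resolved from an ancestor that already exists, a check like this can run before the cache directory has been created, which is the call order described under Root Cause.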