llvm-project
fdce869d - [llvm-mca][Darwin] Fix crash on .subsections_via_symbols in asm input (#182694)

Commit
61 days ago
[llvm-mca][Darwin] Fix crash on .subsections_via_symbols in asm input (#182694)

## Summary

This PR fixes an llvm-mca crash on Darwin assembly containing `.subsections_via_symbols`. The directive is forwarded by `DarwinAsmParser` to `emitSubsectionsViaSymbols()`, which crashes when it hits the base `MCStreamer` `llvm_unreachable` path. The fix adds a no-op override in `llvm/tools/llvm-mca/CodeRegionGenerator.h`, scoped to llvm-mca only.

## Problem manifestation

I ran across this while tinkering with an interactive interpreter/code analyzer and implementing Apple Silicon support.

## Root cause

- `DarwinAsmParser::parseDirectiveSubsectionsViaSymbols` calls `getStreamer().emitSubsectionsViaSymbols()`
- llvm-mca uses `MCStreamerWrapper`, derived from `MCStreamer`
- `MCStreamerWrapper` did not override `emitSubsectionsViaSymbols()`
- the call therefore reached `MCStreamer::emitSubsectionsViaSymbols()`, which is `llvm_unreachable`, causing a crash

## Reproduction

I was able to reproduce this in three ways (note: no difference in behavior was observed with the inclusion/exclusion of `-target arm64-apple-macos` or any other arch flags):

1. AppleClang-generated assembly:

   ```bash
   cat > /tmp/repro.cpp <<'EOF'
   int foo(int x) { return x + 1; }
   EOF

   /usr/bin/clang++ -target arm64-apple-macos -O0 -S /tmp/repro.cpp -o /tmp/repro-appleclang.s
   /opt/homebrew/opt/llvm/bin/llvm-mca --show-encoding --register-file-stats /tmp/repro-appleclang.s
   ```

2. Homebrew Clang-generated assembly:

   ```bash
   cat > /tmp/repro.cpp <<'EOF'
   int foo(int x) { return x + 1; }
   EOF

   /opt/homebrew/opt/llvm/bin/clang++ -target arm64-apple-macos -O0 -S /tmp/repro.cpp -o /tmp/repro-hbclang.s
   /opt/homebrew/opt/llvm/bin/llvm-mca --show-encoding --register-file-stats /tmp/repro-hbclang.s
   ```

3. Handwritten Darwin assembly:

   ```bash
   cat > /tmp/min.s <<'EOF'
   .text
   .subsections_via_symbols
   .globl _foo
   _foo:
     ret
   EOF

   /opt/homebrew/opt/llvm/bin/llvm-mca --show-encoding --register-file-stats /tmp/min.s
   ```

## Fix

Add a no-op `emitSubsectionsViaSymbols()` override to `MCStreamerWrapper` in `llvm/tools/llvm-mca/CodeRegionGenerator.h`.

This keeps the fix local to llvm-mca’s analysis streamer and does not change Mach-O object emission behavior.

A similar fix is implemented in `llvm/lib/Object/RecordStreamer.h` ([link](https://github.com/llvm/llvm-project/blob/main/llvm/lib/Object/RecordStreamer.h#L57-L60)).

## Validation

Reproduced and verified on macOS arm64 with the above three reproduction cases:

1. assembly generated by AppleClang
2. assembly generated by Homebrew Clang
3. manually-authored `.s` file containing `.subsections_via_symbols`

`llvm-mca --show-encoding --register-file-stats` no longer crashes!!

## Notes

From what I can surmise, this is a consumer-side fix (llvm-mca parser/streamer), not a producer fix in clang, since the directive is valid Darwin assembly and can appear in external input files.

This seemed to me like the most targeted, distilled fix that follows the same pattern used elsewhere. Happy to revise the approach if there’s a better fit for llvm-mca, thanks for taking a look!!
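
To illustrate the mechanism described under Root cause and Fix, here is a minimal, self-contained sketch of the pattern. It is a toy model, not the actual patch and not LLVM's API: `ToyStreamer`, `ToyAnalysisStreamer`, `parseLine`, and `countInstructions` are hypothetical names invented for this example. The base class rejects the directive hook by default (standing in for the `llvm_unreachable` path), and the analysis-only subclass overrides it with a no-op so that parsing can continue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for a streamer base class (NOT llvm::MCStreamer).
class ToyStreamer {
public:
  virtual ~ToyStreamer() = default;

  // Default: the directive is treated as unsupported and the process aborts.
  // This models the base-class path that the commit message describes as
  // llvm_unreachable.
  virtual void emitSubsectionsViaSymbols() { std::abort(); }

  virtual void emitInstruction(const std::string &Text) = 0;
};

// Hypothetical analysis-only streamer, analogous in spirit to the
// MCStreamerWrapper mentioned above. It only records instructions.
class ToyAnalysisStreamer : public ToyStreamer {
  std::vector<std::string> Insts;

public:
  // No-op override: the directive affects object emission only, so an
  // analysis tool can safely ignore it instead of hitting the base default.
  void emitSubsectionsViaSymbols() override {}

  void emitInstruction(const std::string &Text) override {
    Insts.push_back(Text);
  }

  const std::vector<std::string> &instructions() const { return Insts; }
};

// Hypothetical, heavily simplified parser dispatch: forwards the directive
// to the streamer, skips other directives and labels, records the rest.
void parseLine(ToyStreamer &S, const std::string &Line) {
  if (Line.empty())
    return;
  if (Line == ".subsections_via_symbols") {
    S.emitSubsectionsViaSymbols();
    return;
  }
  if (Line[0] == '.' || Line.back() == ':')
    return; // other directive or label: ignored in this toy model
  S.emitInstruction(Line);
}

// Feeds all lines through the analysis streamer and returns how many
// instructions were recorded.
std::size_t countInstructions(const std::vector<std::string> &Lines) {
  ToyAnalysisStreamer S;
  for (const std::string &L : Lines)
    parseLine(S, L);
  return S.instructions().size();
}
```

Passing the lines of the handwritten reproduction (`.text`, `.subsections_via_symbols`, `.globl _foo`, `_foo:`, `ret`) to `countInstructions` records a single instruction without aborting. Without the override in `ToyAnalysisStreamer`, the same input would reach the aborting base-class default, which is the failure mode this commit addresses in llvm-mca.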