[llvm-dwp] Replace MCStreamer with direct ELF writer for zero-copy output (#192112)
Replace the MCStreamer-based output pipeline with a lightweight direct
ELF writer (DWPWriter). Section data is stored as zero-copy StringRef
chunks pointing to the mmap'd input files, and written as a minimal
ELF64 relocatable object directly to disk.
## Rationale
The MCStreamer pipeline copies all section data into 16KB MCDataFragment
blocks, accumulates them in memory, then writes everything out during
MCAssembler::Finish(). This can cause significant memory pressure and
slow down llvm-dwp.
For instance, on a 3.3GB DWP file, this translates to roughly 3.3GB of
heap allocation and two full copies of the data.
The new DWPWriter avoids this via:
- emitBytes() stores a StringRef chunk (zero-copy, no allocation)
- emitIntValue() writes to a small per-section buffer (index tables)
- writeELF() streams chunks directly from input mmap to output file
- for single-input DWP files, string deduplication is also skipped, since
all strings are already unique (a minor optimization)
Bonus: this also removes all MC library dependencies from llvm-dwp
(AllTargetsCodeGens, AllTargetsDescs, AllTargetsInfos, MC,
TargetParser), reducing the binary size.
## Benchmark
I benchmarked on a 3.3GB production DWP file (8638 CUs, ~981MB
.debug_str.dwo):
Results:
Before: 23.6s wall (19.6s user, 3.9s sys)
After: 6.0s wall (3.0s user, 2.9s sys)
That is a **3.9x** wall time improvement and roughly 6x fewer page faults
(178K vs ~1M).