fix: improve test reliability and benchmark workflow

- start-no-build test: Use spawn() directly instead of createNext
  so that the process exiting with an error code is handled properly
- instrumentation-order test: Use indexOf-based order verification
  instead of array comparison, so that log lines appearing more than
  once do not break the check
- benchmark workflow: Use pnpm pack + npm install instead of npm link
  to avoid Turbopack workspace detection issues
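
For reference, the indexOf-based order check could look roughly like the sketch below. This is a minimal illustration, not the code from the test file: the helper name `appearsInOrder` and the marker strings are hypothetical, and the real test may compare indices differently.

```typescript
// Hypothetical helper (not taken from the actual test): returns true when
// every marker occurs in `output` in the given order. Repeated occurrences
// of a marker do not matter, because each search starts after the previous
// match instead of comparing the full list of log lines.
function appearsInOrder(output: string, markers: string[]): boolean {
  let from = 0
  for (const marker of markers) {
    const idx = output.indexOf(marker, from)
    if (idx === -1) return false
    from = idx + marker.length
  }
  return true
}

// Placeholder log output in which the first marker is logged twice.
const sampleLog = ['step-a', 'step-b', 'step-a', 'step-c'].join('\n')
const inOrder = appearsInOrder(sampleLog, ['step-a', 'step-b', 'step-c'])
const outOfOrder = appearsInOrder(sampleLog, ['step-c', 'step-a'])
```

A strict array comparison would fail on the sample above because of the duplicated `step-a` line, while the index-based check only requires the relative order to hold.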
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>