chore: run tests sequentially with -x to see first error

DEBUGGING MODE:
- Removed -n flag, so tests run sequentially (no parallel workers) and errors show immediately
- Added -x flag to stop at first failure
- Kept --tb=short for concise tracebacks
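
For reference, a minimal sketch of the resulting invocation. It assumes the suite
runs under pytest and that -n came from pytest-xdist; neither is shown in this
commit. PYTEST_ARGS is a hypothetical variable name, and the "-n auto" in the
comment is an assumed previous value, not taken from the actual config.

```shell
# Hypothetical variable name; the real test config is not shown in this commit.
# Before (assumed): PYTEST_ARGS="-n auto --tb=short"
# Now: no -n (sequential run), -x stops at the first failure,
# --tb=short keeps tracebacks concise.
PYTEST_ARGS="-x --tb=short"

# Print the command the test step would run.
echo "pytest ${PYTEST_ARGS}"
```

To go back to parallel runs later, drop -x and restore the -n option.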
This will be SLOW, but you'll see the actual error message as soon
as the first test fails, rather than after the whole run completes.
Once we identify and fix the errors, we'll re-enable parallel execution.