spark
72575d0b - [SPARK-24552][CORE][BRANCH-2.2] Use unique id instead of attempt number for writes.

Commit
6 years ago
[SPARK-24552][CORE][BRANCH-2.2] Use unique id instead of attempt number for writes.

This passes a unique attempt id to the Hadoop APIs, because the attempt number is reused when stages are retried. When attempt numbers are reused, sources that track data by partition id and attempt number may incorrectly clean up data, because the same attempt number can be both committed and aborted.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #21616 from vanzin/SPARK-24552-2.2.
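
The failure mode described in the commit message can be sketched with a small simulation. Everything here is hypothetical: `ToySink`, `run_scenario`, and `unique_attempt_id` are illustrative names, not Spark or Hadoop APIs, and the bit-shift encoding is only one assumed way to combine a stage attempt number with a task attempt number into a unique id. The commit message itself does not spell out the scheme.

```python
class ToySink:
    """Hypothetical sink that tracks output by (partition id, attempt id)."""

    def __init__(self):
        self.staged = {}     # data written but not yet committed
        self.committed = {}  # data that has been committed

    def write(self, partition, attempt_id, data):
        self.staged[(partition, attempt_id)] = data

    def commit(self, partition, attempt_id):
        key = (partition, attempt_id)
        self.committed[key] = self.staged.pop(key)

    def abort(self, partition, attempt_id):
        # Cleanup removes everything recorded under this key.
        key = (partition, attempt_id)
        self.staged.pop(key, None)
        self.committed.pop(key, None)


def unique_attempt_id(stage_attempt, task_attempt):
    # Assumed encoding: stage attempt in the high bits, task attempt in the
    # low 16 bits. Only unique while task_attempt stays below 2**16.
    return (stage_attempt << 16) | task_attempt


def run_scenario(make_id):
    """Commit partition 3 in stage attempt 0, then abort a retry of it."""
    sink = ToySink()

    first = make_id(0, 0)   # stage attempt 0, task attempt 0
    sink.write(3, first, "rows from first stage attempt")
    sink.commit(3, first)

    retry = make_id(1, 0)   # stage retried: task attempt number is 0 again
    sink.write(3, retry, "rows from retried stage")
    sink.abort(3, retry)

    return sink.committed


# Reusing the bare task attempt number: the abort wipes the committed data.
lost = run_scenario(lambda stage_attempt, task_attempt: task_attempt)

# Using a unique id: the abort only touches the retried attempt.
kept = run_scenario(unique_attempt_id)
```

In this sketch `lost` ends up empty because the committed and aborted attempts share the key `(3, 0)`, while `kept` still holds the committed rows because the retried attempt is recorded under a different id.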
Author
Marcelo Vanzin
Parents
  • core/src/main/scala/org/apache/spark/internal/io
    • File
      SparkHadoopMapReduceWriter.scala