spark
72575d0b - [SPARK-24552][CORE][BRANCH-2.2] Use unique id instead of attempt number for writes.

Commit

6 years ago

[SPARK-24552][CORE][BRANCH-2.2] Use unique id instead of attempt number for writes.

This passes a unique attempt id to the Hadoop APIs, because attempt number is reused when stages are retried. When attempt numbers are reused, sources that track data by partition id and attempt number may incorrectly clean up data because the same attempt number can be both committed and aborted.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #21616 from vanzin/SPARK-24552-2.2.
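
The problem described in the message can be illustrated with a minimal sketch. It assumes one possible way of deriving an id that stays unique across stage retries: packing the stage attempt number and the task attempt number into a single integer. The object and method names below are hypothetical and are not taken from SparkHadoopMapReduceWriter.scala; the commit message does not say how the unique id is actually built.

```scala
// Hypothetical illustration only; not code from the Spark repository.
object UniqueAttemptId {
  // Pack the stage attempt number into the high bits and the task attempt
  // number into the low 16 bits. Assumes both values are non-negative and
  // that the task attempt number is smaller than 1 << 16.
  def uniqueAttemptId(stageAttemptNumber: Int, taskAttemptNumber: Int): Int =
    (stageAttemptNumber << 16) | taskAttemptNumber
}

object UniqueAttemptIdDemo extends App {
  // First run of the stage: task attempt 0 for some partition.
  val firstRun = UniqueAttemptId.uniqueAttemptId(0, 0)
  // The stage is retried: the task attempt number starts again at 0.
  val retry = UniqueAttemptId.uniqueAttemptId(1, 0)

  // The bare attempt number (0) collides across the two runs,
  // while the combined ids differ.
  println(s"first run id = $firstRun, retry id = $retry")
}
```

With the bare attempt number, a writer that keys its output by partition id and attempt number cannot tell the aborted attempt of the first run apart from the committed attempt of the retry. A combined id keeps the two apart, as long as the stated bounds on the attempt numbers hold.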

Author

Marcelo Vanzin

Parents

a6000045

Files (1)

core/src/main/scala/org/apache/spark/internal/io
- SparkHadoopMapReduceWriter.scala
