Core: Track duplicate DVs for data file and merge them before committing #15006
geruh
commented
on 2026-01-11
Core: Merge DVs referencing the same data files as a safeguard
82cced93
Fix dangling delete tests
e41943d2
Simplification in OutputFileFactory
76e24e40
minor optimization
a740ff91
cleanup, make outputfilefactory take in more fields so that we don't …
11ffc2f4
change the duplicate tracking algorithm, fix spark tests
772e3c20
Add more tests for multiple DVs and w equality deletes
3404a860
Rebase and fix spark 4.1 tests
c04d0e0e
more cleanup, put dvfilewriter in try w resources
a39b0737
Add logging, some more cleanup
a079d223
more cleanup
d7eadb00
amogh-jahagirdar
changed the title from "Core: Merge DVs referencing the same data files as a safeguard" to "Core: Track duplicate DVs for data file and merge them before committing" 3 days ago
rdblue
commented
on 2026-01-13
Make dv refs a multimap, group by partition to write single puffin fo…
0a053a6c
Filter files with duplicates before sifting through them and merging
6b04dd98
update old comment
a50fb321