Data Assurance (DA): Over 3 trillion served
For the old(er) folks amongst us … remember as a kid when you’d pass (or maybe stop at) the golden arches (aka McDonald’s) and they had that sign out front: “Over 1 billion served”? Then over time that ‘1’ became 5 … then 10 … then 99 … until now it’s just ‘billions and billions served’?
Well, at my current project I’ve been keeping track of some DA numbers:
- Jan 15, 2019 – 1 trillion rows processed
  - 750 billion rows (compare_mode = row_compare)
  - 250 billion rows (compare_mode = direct_mat)
- May 20, 2019 – 1.5 trillion rows processed
  - 1.1 trillion rows (compare_mode = row_compare)
  - 400 billion rows (compare_mode = direct_mat)
- October 10, 2019 – 2 trillion rows processed
  - 1.5 trillion rows (compare_mode = row_compare)
  - 500 billion rows (compare_mode = direct_mat)
- April 27, 2020 – 3 trillion rows processed (just made it as the project is wrapping up)
  - 2.4 trillion rows (compare_mode = row_compare)
  - 600 billion rows (compare_mode = direct_mat)
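For anyone who wants to double-check the arithmetic, here is a minimal Python sketch that adds up the two compare modes for each milestone and works out what share went through direct_mat. The figures are copied from the list above (in billions of source rows). The dict layout and the `summarize` helper are just my own illustration and have nothing to do with how DA stores or reports its counts.

```python
# Milestone figures copied from the list above, in billions of source rows.
# The dict layout and names are illustrative only, not part of DA itself.
MILESTONES = {
    "2019-01-15": {"row_compare": 750, "direct_mat": 250},
    "2019-05-20": {"row_compare": 1100, "direct_mat": 400},
    "2019-10-10": {"row_compare": 1500, "direct_mat": 500},
    "2020-04-27": {"row_compare": 2400, "direct_mat": 600},
}


def summarize(milestones):
    """Return (date, total_billions, direct_mat_share_percent) per milestone."""
    rows = []
    for date, by_mode in milestones.items():
        total = sum(by_mode.values())
        share = 100 * by_mode["direct_mat"] / total
        rows.append((date, total, round(share, 1)))
    return rows


if __name__ == "__main__":
    for date, total, share in summarize(MILESTONES):
        print(f"{date}: {total / 1000:.1f} trillion rows, {share}% via direct_mat")
```

The per-mode numbers do add up to the headline totals, and direct_mat accounts for roughly a fifth to a quarter of the rows at each milestone.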
That’s right, yours truly is a bona fide member of the DA Trillion Row club.
Anyone else in the DA Trillion Row club? Anyone hit 10 trillion rows? 100 trillion?
Any masochists who have pushed 1 trillion rows through rs_subcmp?
Drop a line in the comment section with the number of rows you’ve pushed through DA.
Notes:
- see my recent blog post (bulk data copy from Oracle to HANA) for more details on the new compare_mode=direct_mat feature
- counts are based on the number of rows in the source data set
- on my current project it’s not uncommon to run DA/compare jobs (compare_mode=row_compare) several times against a database, so the total number of rows in the client’s databases is a good bit less than 1.5 trillion
- a few tens of billions of rows have fallen through the cracks due to aborted jobs, lost job reports, etc.