Oh, how shiny is the world of Extract-Transform-Load in data analysis.
But it’s not. It’s the major fun spoiler in all of it. Now I see a better ‘name’ being used [proposed?] in this otherwise also insightful piece.
WEC it will be. Wrangling, Exploration, Cleaning.
All three are much needed. The W+C parts certainly hint at the average quality of your data. [The E part still has an air of cleanliness about it.] But the W+C parts also hint at the introduction of bias, etc. As already flagged here, unbiasing may not be so easy, which hints at the return of the old Law of Conservation of Trouble.
Take the use of ML in accounting, for example: there, the less the data is touched, the better. Any uncleanness is itself a signal of weakness, and scouring for exceptions is not so easy when those very exceptions have been cleaned out (or even lost in the wrangling, without anyone conceding it).
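To make the accounting point a bit more tangible, here is a minimal sketch, not anyone's actual pipeline. The ledger entries, the function names and the "ten times the median" threshold are all made up for illustration. It contrasts cleaning by dropping odd entries with leaving the data untouched and merely flagging them.

```python
from statistics import median

# Hypothetical ledger entries (illustrative only); None marks a missing amount.
ledger = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": 95.5},
    {"id": 3, "amount": None},
    {"id": 4, "amount": 110.0},
    {"id": 5, "amount": 98000.0},
    {"id": 6, "amount": 101.25},
]


def typical_amount(entries):
    """Median of the amounts that are present."""
    return median(e["amount"] for e in entries if e["amount"] is not None)


def is_exception(entry, typical, factor=10.0):
    """True when the amount is missing or far above the typical value.

    The factor of 10 is an arbitrary assumption, not a recommended threshold.
    """
    amount = entry["amount"]
    return amount is None or abs(amount) > factor * typical


def clean_by_dropping(entries):
    """Conventional cleaning: the exceptions silently vanish."""
    typical = typical_amount(entries)
    return [e for e in entries if not is_exception(e, typical)]


def clean_by_flagging(entries):
    """Keep every entry as it was and only add an 'exception' flag."""
    typical = typical_amount(entries)
    return [{**e, "exception": is_exception(e, typical)} for e in entries]


dropped = clean_by_dropping(ledger)
flagged = clean_by_flagging(ledger)
exception_ids = [e["id"] for e in flagged if e["exception"]]
```

With the dropping variant, the two entries an auditor would most want to look at are gone before any model sees them. With the flagging variant they stay in the set, with their original amounts, and the flag can serve as a signal rather than as a deletion.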
Oh well. It’s Friday; we’re off now with:
[When the weather agrees; Gent]