Back to the post from a week ago …
Where the issue was knowing your data, and how that's still a problemo big time, if you can get hold of sufficient data of seemingly appropriate quality at all.
There’s a flip side, on the ‘engine’ side [ref of course Turing machines, but why do I write this, you knew that, duh], where you #include just about anything of value [won’t link back to that post of mine again…!]: standard libraries and, sometimes, pretty core stats stuff of which you also don’t know the quality.
The standard things like printf() will, I guess, by now be hashed out. The stats stuff however… I’d say we’ll hear a lot again about bystanders using the stats but not seeing that there’s possibly some obscure shortcut in there that plays out in destructive ways many <time>s from now. It starts with leaving out Bessel’s correction in RootMeanSquareError / Variance / StdDev calculations. As if degrees of freedom had no meaning; where, on the contrary, they ‘came from’ practice into stats mathematics…
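To make that particular concrete, a minimal sketch in Python/NumPy. The data values are made up for illustration; the library defaults are real and documented, but do check the docs of whatever stack you actually use:

```python
import numpy as np

sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# NumPy's var()/std() default to ddof=0: divide by n, the population
# formula, i.e. *no* Bessel correction. For a sample, that's biased low.
print(np.var(sample))          # 4.0     (divides by n)
print(np.var(sample, ddof=1))  # 4.5714… (divides by n - 1, Bessel-corrected)
```

And pandas, for one, defaults the other way (ddof=1). Same data, two ‘variances’, depending on which library the bystander happened to #include.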
Now, having taken you from universalia to particulars, let’s return. Have you verified that the code actually delivers what is asserted to be the functionality of the very stats / data processing / ML / Deep Learning / AI libraries that you’re using for those edge-of-the-art, übercompetitive hence most-profitable, lab-environment pilot plaything systems of yours..?
I’d think not.
At your peril.
‘Open Source’ doesn’t relieve you of your duty to see, to ensure, to prove to yourself, that at least someone else has seriously tried to falsify the code [in the Popperian sense, and failed].
Now what ..?
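Well, for a start, you could attempt a little falsification of your own: cross-check the routine you #include against an independent reference implementation on random inputs. A sketch, with the Python stdlib taken purely as the example target; substitute whatever you actually use:

```python
import math
import random
import statistics

def ref_stdev(xs):
    # Independent reference: two-pass sample standard deviation,
    # Bessel-corrected (divide by n - 1).
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

# Cross-check the library routine against the reference on random inputs.
for _ in range(1000):
    xs = [random.gauss(0.0, 1.0) for _ in range(random.randint(2, 50))]
    assert math.isclose(statistics.stdev(xs), ref_stdev(xs),
                        rel_tol=1e-9, abs_tol=1e-12)
print("stdev() survived 1000 falsification attempts")
```

Not proof of correctness, ever [Popper again], but a lot better than taking the README’s word for it.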
Whereas, flip side, there may be many more pre-cooked functions out there, of top-notch quality, that you just haven’t looked hard enough for. Like, in a similar field, I hear too little ‘use’ of this [don’t just jump to 1:44+] and this [2:48+].
Now, on a positive note:
[Easily recognisable and known to all, right? Bouvigne, Brabant of the North flavour]