Blog

Where’s Fb’s PI license ..?

Hm, last time I looked [admittedly, some time ago], in many US states PIs had to get a license because they deal in such shady business as profiling. Shady, since it’s an outright intrusion into privacy that’s going on. One-on-one.

Haven’t heard that everyone who had access to, e.g., Facebook’s data and the profiles that can be derived from it, has been vetted, each and every one of them. Yes, everyone with access through any account to any Fb data may have needed to be licensed.

And now you answer that users per EULA [or whatever legalese [was about to write ‘sleazy’ but that’s pleonastic] phrase you’d have for it] agreed to their data being used. But unwitting disclosure (signing off on an unreadable EULA is that, not wilful; ‘disclosure’ as transfer of info for any use elsewhere) and unwilful disclosure are both on the other side versus wilful disclosure. Maybe unwitting disclosure isn’t a legal category yet, but it should be. Any transfer of info is purpose-bound in a narrow sense [yes, legally it always has been]; and derivative info, when not used for the immediate benefit of the subject only, also falls under that narrow-sense, subject-benefit-only protection requirement.
Also, it’s not targeted but mass trawling. Not even every state officer can do that; officially, it is allowed to a certain, very narrow group only. Why, then, would a private party not have such limits (down to zero), when it’s not one-on-one but massively upscaled ..?

So, only info explicitly posted to Public can be shared with that Public, and no right is transferred to extract economic value from it. … Well, that’s pushing it, right?

But certainly, no-one has said that licensing suddenly wasn’t required anymore. Including full compliance with all the requirements to get and hold the license. ‘tSeems to include some stuff on info secrecy, right?
And, is this post ‘against’ Fb? Maybe. Or not ..! Just as, some time ago, a lot of people weren’t necessarily against Al Capone, and he evaded conviction. Until he was caught on that most tangential issue, remember ..?

Yes I’m rambling. But still.

And:

[That time of year again… Museumplein]

The 20/20 on Next Year’s Big Things

[Sigh] couldn’t resist the introvert-dad joke in the title.
On the verge of the last Q of ’19, so you have a little spare time to prep; this, about the really, really Big Things that will capture the news next year:

  1. Genetic algorithms (like here), maybe outright towards solving hard problems that ML-training offers no convergence on or, most probably, as an add-on stacked on top of Last Year’s ML results. As mentioned here, but also here and here (with links). Also, when you’re hooked on Python anyway: this;
  2. Some practical solutions à la plastic-eating bacteria going onto large-scale deployment, or CO2-capture into building material or into C/O2 reduction via solar thus producing the much-wanted pure C and pure O2 – some early trials are operational already but Scale will come next year;
  3. Hydrogen cars. Apart from safety issues [but similar safety was solved, adequately if not 100.00%, for fossil-fuel cars, so what’s the big deal — and edited to add: it seems that electrics catch fire much more often than fossils, and are harder to put out; yet more reason not to jump to electrics], the infrastructure’s mostly there. Just add an underground tank plus pump, right ..? No need to build extensive parallel charging stations that, comparatively, still take ages to fill up at. Also, where’s the Formula-H class Grand Prix ..? Possibly, we’ll have these in abundance, but in the long term they still may be overtaken [huh. boring….] by Cells. And the Scots are onto something [apart from their wisdom in wanting to Remain; as a separate country, could they ..?]. Hopefully, ‘Shipping’ will be an innovation testbed already next year, qua hydro development, in their hydro environment ;-/ with secondary options (solar) and with sufficient room for installations on-board and qua land-based refuelling points;
  4. Breakthroughs in medicine, being able to cater much better ever quicker to gender/age-specific requirements;
  5. … AI …? Only where BPR-driven. Yes, that’s right; despite the frequent re-naming almost every year for the past <somanyyears>, the latest being (sic) RPA, it’s still basic BPR in its original meaning, not the totally-overbureaucratised ‘method’. Gartner’s lists (and others’) are just a set of Mehhh’s compared to the above.
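Since #1 keeps coming up: a genetic algorithm needs surprisingly little machinery. A toy sketch in plain Python (none of the linked libraries; the OneMax problem and all parameters here are illustrative assumptions), evolving bit-strings toward all-ones via tournament selection, one-point crossover, and mutation:

```python
import random

def ga_onemax(n_bits=20, pop_size=30, generations=60, mut_rate=0.05, seed=42):
    """Toy genetic algorithm: evolve bit-strings toward all-ones (OneMax)."""
    rng = random.Random(seed)
    fitness = sum  # fitness of an individual = number of 1-bits
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def select():
        # Tournament of two: keep the fitter individual
        a, b = rng.choice(pop), rng.choice(pop)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = select(), select()
            cut = rng.randrange(1, n_bits)                              # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ (rng.random() < mut_rate) for bit in child]  # mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = ga_onemax()
print(len(best), sum(best))   # 20 bits; fitness should end up close to 20
```

Stacked on top of Last Year’s ML results, as in #1, the ‘individuals’ would encode model or hyperparameter choices and the fitness function would be the trained model’s score.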

You’ll see I’m right.
Since #6 I don’t list, it being my discovery of how to do time travel. Come to think of it: I discovered that in 2029 … but after and before that, who cares about the discovery date ..?

Now then, I’ll await the veracity of the above, with:

[Ah, what a museum! Drake’s first drill near Allegheny, or near Cleveland which sounds similar to Indianoplace]

Qualified audits/auditors

On the abuse of language: on the one hand, some auditors call themselves ‘qualified’, whereas at the same time they (seldom) give opinions – as in: statements they want to have the value of hard fact – that are ‘qualified’, meaning that there is something seriously wrong with the subject (i.e., object, in epistemological terms) but they just don’t know how to put down factually how bad it is.

I can agree with the part where they consider themselves qualified, in the latter sense. Especially those that call themselves qualified. Which often is intended to say that others, who don’t qualify themselves as such, aren’t. Which is truth in reverse.
Also, it’s like being a lady: If you have to say it of yourself, …

But I understand that some call themselves qualified indeed. Like, the members of this charter that ticks too many boxes on the list of characteristics of a criminal organisation. In a literal sense, not even in the figurative one that opines (sic) on auditors in general. Dutch auditors would translate a ‘qualified’ opinion as ‘gemankeerd oordeel’ [roughly: a defective judgment], but the ‘gemankeerd’ then also applies to those that qualify themselves as qualified.

But do get rid of the ambiguity or people will remain ambiguous about your capabilities…

That much for now; with:

[Qualified, as useful; once at Glassfever Dordrecht. No, it’s deliberately vague; didn’t you get the reference to the above? Then you may be ‘qualified’…]

Training your way out of bias

No, this is not about bias in the data that you train your “AI”, i.e. ML, on. I’ve posted (not nearly) enough about that.

It’s more about pointing to an HBR article that should be(come) very influential…
As it was already known that any ‘training’ was about as ineffective as one could get, qua e.g., security-awareness transfer [posted about that already, too], this piece elucidates why; among other things, because such training treats those sent to it as a priori suspect, i.e. guilty, and punishes them, both creating counter-emotions. But read the whole thing; worth it!

Also, in regard to this earlier post: Thinking in win-lose terms … one is all that you can score [ref. Frankie Goes to Hollywood, Two Tribes; yes, you got that, or you’re n00bish like here, 10th paragraph].
This may be corrected by the above pointer. But be careful; be very careful. People resent nudging; when they find out about it, they’ll balk at the brainwashing. ‘tSeems one has to change little habits, big habits, unconscious ones and blatant ones, of every individual individually, and at societal levels — all in all quite a complex/wicked problem (with this), even leaving aside the thingy about manipulating society towards some better ‘good’, which of course leads to the Utopia dystopia, if it were achievable at all. Like, any striving for an ideal society optimises its positive effects at about the inflection point, i.e., halfway through its implementation; after that, the negatives begin to outdo the positives…

Enough for now; leaving you with:

[Don’t get defensive …! Nancy]

Analysis first, data analysis later

This, about the trend to see play time being over and serious business resuming; with ML.

Where no longer does one just let some business newbie just out of college (or less) play with a bricks set of tools including a minute subset of all relevant vectors [i.e., a tiniest sliver of context], and then have … Tadaaa! A proof of concept, at best. Now, let’s see how small a part it might take in regular business operations — don’t transform your business too much, because that might undo the relevance of the rules learned ..! Even when ‘AI translators’ come into fashion, they still take things bottom-up more than they dare to admit, still with an ML focus first, seeking any place suitable to deploy it and hence reaping point solutions at best.

Where, rather, one would want to take things from the other side. Top-down. Maybe not all the way from the top, as insights about ML may not live there, and insights about how the business is run, also not [quite completely not, often].
But somewhere-middle-out, where some may understand how the business is run. And where process mining may help to understand what actually goes on, and then plot the following:

to see that a lot of processing in any process step would be repetitive, fetchable-in-an-algorithm-of-some-kind [1] simple data handling, plus a degree of human intervention and deviation based on ad-hoc intelligence applied.

Yes, ‘if any’. But also pointing to this piece on the long-awaited, longed-for demise of ‘six sigma’ and its dehumanising characteristics/effects.

So you can choose which process (step/steps) would lend themselves best to possible (sic) disruption by ML application of some sort.
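What such a choice could look like in code, as a sketch: assume a toy event log (the log, the activity names, and the ‘predictability’ score are all made up for illustration, not taken from any particular process-mining tool). High volume with predictable routing flags an automation/ML candidate; low predictability flags the ad-hoc human intervention mentioned above.

```python
from collections import Counter, defaultdict

def score_steps(event_log):
    """Score each process step: high volume with predictable routing suggests
    an automation/ML candidate; low predictability marks ad-hoc human work."""
    volume = Counter()
    followers = defaultdict(Counter)
    for case in event_log:                     # each case: ordered list of activities
        for i, act in enumerate(case):
            volume[act] += 1
            followers[act][case[i + 1] if i + 1 < len(case) else "END"] += 1
    scores = {}
    for act, cnt in volume.items():
        # Share of occurrences taking the single most common next step:
        # 1.0 = fully predictable routing, lower = human deviation in play.
        predictability = followers[act].most_common(1)[0][1] / cnt
        scores[act] = (cnt, round(predictability, 2))
    return scores

# Hypothetical event log: three cases of a made-up claims process
log = [
    ["receive", "check", "approve", "pay"],
    ["receive", "check", "approve", "pay"],
    ["receive", "check", "reject"],
]
print(score_steps(log))   # 'check' scores 0.67: the human decision point
```

A real process-mining tool would of course mine the log from system traces and add timing, cost, and rework loops; the point here is only the shape of the ranking.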

The some sort of which you then need to figure out. And also see how the ML bits (possibly/probably no more than bits) fit into a new, redesigned process. Yes, ML may make sense, to a degree to be ascertained, but humans may need to change – luckily for you, the dot on the horizon, which should be waaayyy before the horizon, is for humans to be rid of the mundane stuff and to be tasked onto the Intelligent stuff. IF empowered, enriched.

So, in the end, hybrid processes may be a thing to aim for; want.

Plus:

[Artes and Ciencias; Valencia]

[1] The bandwidth of classical algorithms, expert systems, Big Data correlations, classifier ML, more-complex ML, basic neural nets, complex neural nets, evolutionary algorithms [or are they a separate, close but parallel track?] as explained in this.

Fully loaded

Because recently there were, yet again, seriously Wrong statement-mess discussions about ZP’ers, independent professionals (not about ZZP’ers, the self-employed without staff): that they would mainly have other interests – quod non 1 – and that they would be so expensive – quod non 2.
QN2, because:

A permanent employee earns € 4,000 net per month, for the sake of the example. That costs the employer € 6,487 gross, according to all the sites; € 77,844 per year. Plus holiday allowance € 6,227.52, plus probably a 13th month € 6,487, plus pension € 15,000 ballpark, plus training budget € 1,000 (g..d, what a measly little amount), and all of that paid through during holidays or illness… And then I’m still forgetting a number of costs. Oh yes, the transition payment; let’s assume 10+ years of service, so add half a month.
Total: € 109,802.02 for 1,600 hours of presence (very generous), being [as research has shown, at 2 hours per day, and that without even accounting for holiday absence etc.! so very generous] 400 actually productive hours. That is € 194.61 gross salary per productive hour.

To earn that gross/net grosso modo € 110,000 — because the ZP’er must also build up a pension out of it, pay for courses themselves, save up for their own non-working holiday days themselves, provide their own up-to-date tools, etc. etc. etc., and given a rough 30% income tax over the gross hourly rate (since the tax office doesn’t exactly give a discount for not working due to holidays or such, and coughing up contributions to health insurance etc. yourself, even on a low income …) after remitting VAT,
that results in an hourly rate of € 275.00.

From a ZP’er we may expect the productive/total-hours ratio to stand at some 50%, and there I’m selling myself short [actually tracked in my own records: I come out at 90–95%]. If it were about productive contribution to the organisation, the ZP’er’s income and hourly rate could thus be double that of a permanent employee.

We understand that the government is against discrimination in the labour market and wants to avoid exploitation of ZP’ers.
We understand that employers don’t want to pay ZP’ers more than permanent employees — why not, actually!? The employer may well pay for the flexibility of not having to fire people every time, and of being able to bring in fresh, exactly-fitting knowledge with experience gained elsewhere (learning from (others’) mistakes) every time..!
Whoever refuses to see that the employer pays for the contribution to the organisation should immediately resign for far-reaching incompetence.

That means the hourly rate of a ZP’er may be about 2.83 times as high as the gross hourly rate of a permanent employee, for the same contribution.
In regular practice, for the categories of work that normally go for sub-€ 80k per year, it’s not 2.83 times as much, but half of that. So the ZP’er’s hourly rate may be some 5.65 times higher to break even. Do the math.
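For the spreadsheet-inclined, the arithmetic above can be recomputed in a few lines; all figures are straight from the example (only the rounding is mine).

```python
# Recomputing the worked example above; all figures straight from the text.
gross_salary = 77_844.00                    # yearly gross for €4,000 net/month
holiday_pay  = 0.08 * gross_salary          # €6,227.52
thirteenth   = 6_487.00                     # probable 13th month
pension      = 15_000.00                    # ballpark, as above
training     = 1_000.00
transition   = 6_487.00 / 2                 # half a month at 10+ years of service

total_cost = gross_salary + holiday_pay + thirteenth + pension + training + transition
assert round(total_cost, 2) == 109_802.02   # matches the total above

present_hours, productive_hours = 1_600, 400
print(round(gross_salary / productive_hours, 2))        # 194.61 gross per productive hour

zp_rate = 275.00
employee_gross_hourly = gross_salary / present_hours    # ~ 48.65 per hour of presence
print(round(zp_rate / employee_gross_hourly, 2))        # 5.65x on raw hourly rate
# At double the productive-hours ratio (50% vs 25%), per unit of actual
# contribution that is the 2.83x factor from the text:
print(round(zp_rate / employee_gross_hourly / 2, 2))    # 2.83
```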

However you twist or turn it, the idea that a ZP’er should (or may) be about as ‘expensive’ on gross hourly rate as a permanent employee is outright fraud. No, that’s a euphemistic qualification; not an over- but an understatement.

Waiting for justified-hourly-rate assignments…, with:

[Yes, but sitting snugly in your fixed little cubicle is so nice! – exactly, so that benefit in kind may be deducted from your income…? Zuid-As]

Small. ish.    Nope.

On several occasions it struck me that, advising/consulting on subjects like AI deployment and information-risk management [usually not in one go], nowadays the relation between company ‘size’ and headcount seems to have gone less strictly linear. Like, there’s still a lot of big org’s out there that do have large numbers of fte’s

[Skipping for a moment the subject of their productivity towards the bottom line; drilling down, one often finds that ‘profit’ or even turnover is more of an emergent property than specifically allocatable to individual KPIs (don’t claim that executives meet their KPIs and are the money makers – that claim is a delirious scam), thus calling into question the idea that there’s tons of dead wood around that could be weeded out. That goes against one of my previous hobby horses, by the way; thanks for noticing, but I’m not above giving in to nuance, on the contrary huh]

but now, there’s also a fair number of clients with quite limited colleague/’member’ numbers that still have huge turnover — in terms of what counts [ever more]: data processing. Got’ya; I didn’t write ‘information’ for a reason. As in: when impact on clients/customers is the Value de rigueur of the latter day [oh, not that again], these scale-ups make a splash waaayy beyond their size. Or turnover, or even profitability; those become less and less relevant, tending to zero. It seems. And it explains valuations better than said three measures of ‘size’ or ‘impact’.

So, shouldn’t we start to compare the soon-to-meet-Schumpeter’ian organisations to similarly-sized-data-processing organisations of any kind, and then conclude which ones are more efficient? Turnover, profits, headcounts don’t count anymore.

Uhm. Now what.

Oh, at least, this:

[Sending data to once a mighty empire …? Coincidence: The Empire Home truck; London]

Don’t forget GDPR when untraining your ML

Training ML systems is bound to use personally identifiable information, PII, usually dubbed ‘personal information’. That latter term diminishes the scope way too much, by leaving out that any bit of information that, in conjunction with outside sources of any kind, can be used to identify a person, is PII.[1]
Under GDPR, there’s the right to be forgotten… Now there’s two problems:

  • Sometimes, data points can be retrieved literally from the trained system, like here. Clearly, such data points then need to not be reproduced anymore. But how to un-learn an ML system when the data point involved needs to be forgotten? [2]
  • Similar, less literal cases apply. E.g., when it’s not that one data point is regurgitated, but that the one data point left an off-average trace in the weights/trained parameters. Which is probable, since an ML system hardly learns from n times an average value [it may, but then that’s not ML but fixed-function learning, fixed-‘algorithm’-wise] but from n different values, the one of concern among them. How to get that contribution out of the weights, and how to prove (which you may have to, under GDPR obligations, though only when push comes to shove) that your ML weights no longer include that one data point’s impact ..?
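To make the second bullet concrete: a toy least-squares model on hypothetical data shows how one data point’s influence is smeared over the trained weight, with exact ‘unlearning’ amounting to retraining without it. (A sketch only; real ML systems have millions of weights, which is exactly the problem.)

```python
def fit(points):
    """Least-squares slope through the origin: w = sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in points) / sum(x * x for x, _ in points)

# Hypothetical training data; the last point is the one to be forgotten.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 9.5)]
w_full = fit(data)
w_retrained = fit(data[:-1])    # exact unlearning: retrain without the point

# The forgotten point's influence is smeared over the single trained weight;
# there is no sub-weight to delete, so only retraining provably removes it.
print(w_full != w_retrained)    # True: the full-data weight carries its imprint
```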

It’ll be fun, they said. For lawyers, they said.

Still, the whole thing may need to be figured out before anyone can deploy any ML system that included European citizens’ data — since the GDPR has global effect.
Now you have fun, I say.

With:

[You probably are on camera here…]

[1] Side note: I was wont to write ‘can and will’, which is true but sounds too much like ‘anything you say can and will be used against you in a court of law’ [disregarding the exact wording], which will in fact alter what I say, as I now include the consideration of what and how I say things subsequently. To which I ask: when, not if, not all that I’d say is actually used in a court of law, does this invalidate the statement made to me, rendering the ‘will’ part invalid, i.e., the respective speech part(s) that are used, illegal(ly obtained) evidence ..? Since I say things other and/or differently than I would without the statement at arrest, i.e., based on a statement by a sworn officer that is later proven false, perjurious even. Entrapment? That’s illegal in many circumstances…
Would want to know from a legal scholar how this works.

[2] Most probably, you will not be able/allowed to keep that data point for any specific reason. To say that it’s too difficult to get the data point out of the trained system: Does. not. work. The law just requires you to do the near-impossible; your mistake. Just train the system all over again; why would anyone care about your interests? GDPR requires you to only ask how high you have to jump, and then do that, whether you’d have to set a world record or not.

Diffuse parameters, diffusing laws

Already, we were aware that

  • With ML systems, the lines between software/fixed algorithms, parametrisation, and the semantic meaning of the outcomes are blurred. We have no ‘place’ where the ‘logic’ sits or is stored/used; it’s all getting mushy, and that’s not a good thing;
  • The law wants neat yesteryears’ algorithms (protocols, parameters, provable actions upon intent, etc.);
  • Adversarial AI exists, whether we call it AI or the mere ML that it is.
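Qua the third bullet, a miniature of the adversarial idea (a linear classifier with made-up weights; the fast-gradient-sign trick reduced to its simplest form): a small nudge against the weight signs flips the decision, with the input barely changed.

```python
def predict(w, b, x):
    """Linear classifier: sign of w.x + b."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

def adversarial(w, x, eps):
    """Nudge every feature by eps against the weight's sign: the
    fast-gradient-sign idea, exact for a linear model."""
    return [xi - eps * (1 if wi >= 0 else -1) for wi, xi in zip(w, x)]

w, b = [0.6, -0.8], 0.05            # hypothetical 'trained' weights
x = [0.30, 0.10]                    # score 0.6*0.30 - 0.8*0.10 + 0.05 = 0.15 -> +1
x_adv = adversarial(w, x, eps=0.2)  # a barely-changed input...

print(predict(w, b, x), predict(w, b, x_adv))   # 1 -1: the decision flips
```

Whether crafting such an `x_adv` against a deployed system counts as ‘hacking’ is exactly the definitional gap the articles below worry about.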

All three of these in concert don’t give hope. As explained here, and more profoundly here, ‘hacking’ may not be appropriately defined, if it currently is at all, once one uses adversarial AI (‘ad-AI’) to mess with ML-driven (literally) systems. The latter is more like solicitation or so…?
To expound, I copy a little from the article:

Unless legal and societal frameworks adjust, the consequences of misalignment between law and practice include (i) inadequate coverage of crime, (ii) missing or skewed security incentives, and the (iii) prospect of chilling critical security research. This last one is particularly dangerous in light of the important role researchers can play in revealing the biases, safety limitations, and opportunities for mischief that the mainstreaming of artificial intelligence appears to present.

… why this lack of clarity represents a concern. First, courts and other authorities will be hard-pressed to draw defensible lines between intuitively wrong and intuitively legitimate conduct. How do we reach acts that endanger safety—such as tricking a driverless car into mischaracterizing its environment—while tolerating reasonable anti-surveillance measures—such as makeup that foils facial recognition—which leverage similar technical principles, but dissimilar secondary consequences?
Second, and relatedly, researchers interested in testing whether systems being developed are safe and secure do not always know whether their hacking efforts may implicate federal law … Third, designers and distributors of AI-enabled products will not understand the full scope of their obligations with respect to security.

Yes there’s a call to action.
Since “We are living in a world that is not only mediated and connected, but increasingly intelligent. And that intelligence has limits. Today’s malicious actors penetrate computers to steal, spy, or disrupt. Tomorrow’s malicious actors may also trick computers into making critical mistakes or divulging the private information upon which they were trained.”
Haven’t heard too much reflection on this, yet.
Would definitely want to hear yours. Please.

[Edited to add: Do also read between the lines of this, qua probably mostly surreptitious data capture contra the GDPR… And what if I want my data to be removed from the ML parameters ..?? See upcoming Monday’s post]

Oh, and:

[On Mare Nostrum I mean Mare Liberum, the legal ship may have sailed. On a vast expanse of not much. Outside Porto]

Boring Under 30s …

Just when you thought about getting into it, maybe, from somewhere near the bottom… One should be careful to know what the bottom looks like.
Qua diving into ‘Data Science’ quod non, that so many have put their personal hopes in, but … tempted how and why ..?

Earlier, I posted this, on how all of the Fourth Estate – as far as it is independent and also focused on others that might still be independent, now apparently unwanted and to be turned into 4thE sheeple – wrote about how one would have to slave oneself to death for the most minute chance of Making It.

Then, news came around that actually, it seems like the Model doesn’t work anymore… In this more recent piece, and various linked posts (and external articles) therein.

Today, even more support for the above warning. Maybe not for some (e.g., him), that had the appropriate insights long ago already and surfaced to surf, if we may express it that way, and only still need to get a suitable spot – or this, if you know the place or (have) be(en) there.

Also:

Now then, I’ll leave you with today’s Link to study and weep, and:

[Not quite the above, but close …?? Just North of Siena]