Since ‘reinforcement learning’ is too broad a term for what is meant here… we’d better call it continued-in-deployment learning (CiDL).
Back to the subject, being this CiDL, and how one would audit a system that uses such a module as one of its many (sic) parts. The weights change constantly, a little at a time. At which point would one say that the system needs to be re-trained, since the implementation ‘certification’ (huh) of, e.g., unbiasedness, determined once a long time ago, doesn’t apply anymore?
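To make that ‘at which point’ a bit concrete: one could, for instance, snapshot the weights at certification time and track how far the live model has drifted from that snapshot. A minimal sketch (Python); the drift metric and the 0.05 threshold are purely illustrative assumptions of yours truly, not any standard or certified method:

```python
import numpy as np

def weight_drift(certified_weights, live_weights):
    """Relative L2 distance between the weights as certified and the weights now.

    Both arguments: lists of numpy arrays (one per layer), assumed to have
    identical shapes. Returns a single scalar; 0.0 means 'unchanged'.
    """
    num = sum(np.linalg.norm(c - l) ** 2 for c, l in zip(certified_weights, live_weights))
    den = sum(np.linalg.norm(c) ** 2 for c in certified_weights)
    return float(np.sqrt(num / den))

# Illustrative policy only: the 0.05 cut-off is an arbitrary example, not a norm.
DRIFT_THRESHOLD = 0.05

def certification_still_applies(certified_weights, live_weights):
    return weight_drift(certified_weights, live_weights) <= DRIFT_THRESHOLD
```

Whether such a distance-in-weight-space even correlates with the property that was certified (e.g., unbiasedness) is exactly the open question of this post.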
A couple of considerations:
- For sure (sic again), the unbiasedness of yesterday is today’s societally unacceptable bias.
- [Shall we newspeak-change ‘bias’ to ‘prejudice’ ..? Legally and practically, that captures better what’s at stake.]
- The cert will have to carry a clause of validity at delivery time only, or be a fraud.
- Do we have similar issues already, with other ‘algorithms’ ..? Yes we do, as explained here.
- Since, between the lines, you read there the connection to ‘Dev(Sec)Ops’: that, like the scrummy stuff, should be no problem to audit nowadays. Or… check your engagement checklist: you do not have the prerequisite knowledge, let alone understanding. Period.
- So, how do you audit DevOps developments for, e.g., the continued ‘existence’ of controls once devised? And how could you not also audit ML performance (vis-à-vis criteria set before training started, in terms of error rates etc. etc.) to see that it remains within a certain bandwidth of accuracy, and that’s enough ..? The bandwidth being the remain-materially-compliant-with(in)-controls of the other algo levels. (A minimal sketch of such a bandwidth check follows after this list.)
- Side note: how do you, today, ‘audit’ human performance on the manual execution of procedures, i.e., algorithms ..??
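As promised, by way of illustration only, the bandwidth idea: a periodic re-test of the live model against a fixed reference set, with the acceptable band fixed at certification time. All names and numbers below are assumptions made up for the sketch, not standards, and certainly not anyone’s certified control:

```python
from dataclasses import dataclass

@dataclass
class AccuracyBand:
    """Acceptance criteria fixed before/at certification time."""
    baseline_error_rate: float   # error rate as measured at certification
    tolerance: float             # allowed absolute deviation from that baseline

def within_band(current_error_rate: float, band: AccuracyBand) -> bool:
    """True while the live model remains materially compliant with the band."""
    return abs(current_error_rate - band.baseline_error_rate) <= band.tolerance

# Example periodic check (numbers purely illustrative):
band = AccuracyBand(baseline_error_rate=0.08, tolerance=0.02)
current = 0.11   # would come from re-scoring a fixed reference/audit set
if not within_band(current, band):
    print("Out of band: flag for re-certification / re-training review.")
```

The same check could, and given the bias point above probably should, be run per relevant subgroup rather than on the aggregate error rate alone; an aggregate that stays in band can still hide a subgroup that has drifted well out of it.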
That’s all for now; details may follow (or not; IP has several meanings…).
Leaving you with:
[Designed for Total Compliance; one of the Big-4’s offices, Zuid-As Amsterdam]