I want to talk about past work on this and how it works for ORES.
Right now, ORES' primary mechanisms for accountability look a lot like those of the rest of the software around Wikipedia. We have public work boards, public processes, and public (machine-readable) test statistics, and we publish datasets for reuse and analysis. We encourage and engage in public discussions about where the machine learning models succeed and fail to serve their intended use-cases. Users do not have direct power over the algorithms running in ORES, but they can affect them through the same processes by which other infrastructures are affected in Wikipedia.
This may not sound as desirable as a fully automated accountability dream that allows users more direct control over how ORES operates, but in a way, it may be more desirable. I like to think of the space around ORES in which our users build false positive reports and conversations take place as a massive boundary object through which we're slowly coming to realize what types of control and accountability should be formalized through digital technologies and/or rules & policies.
At the moment, it seems clear that the next major project for ORES will be a means to effectively refute ORES' predictions/scorings. Through the collection of false positive reports and observations about the way that people use them, we see a key opportunity to enable users to challenge ORES' predictions and provide alternative assessments that can be included along with ORES' predictions. That means tool developers who use ORES will find ORES' prediction and any manual assessments in the same query results. This is still future work, but it seems like something we need, and we have already begun investing resources in bringing it together.
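To make that a bit more concrete, here is a rough sketch of how a tool developer might consume such a combined result. Since this is future work, everything here is an assumption for illustration: the response shape, the field names (`prediction`, `manual_assessments`, `label`), the revision ID, and the helper functions are all hypothetical and not part of ORES today.

```python
from typing import Any, Dict, List

# Hypothetical combined result: ORES does not return this shape today.
# All field names and values below are invented for illustration.
example_result: Dict[str, Any] = {
    "rev_id": 123456,
    "model": "damaging",
    "prediction": {"label": True, "probability": 0.91},
    "manual_assessments": [
        {"user": "ExampleReviewer", "label": False,
         "note": "Good-faith formatting fix"},
    ],
}


def effective_label(result: Dict[str, Any]) -> bool:
    """Prefer the most recent human assessment; otherwise use the model's label."""
    assessments: List[Dict[str, Any]] = result.get("manual_assessments", [])
    if assessments:
        # Assumes assessments are listed oldest first.
        return assessments[-1]["label"]
    return result["prediction"]["label"]


def is_disputed(result: Dict[str, Any]) -> bool:
    """True if any human assessment disagrees with the model's prediction."""
    predicted = result["prediction"]["label"]
    return any(a["label"] != predicted
               for a in result.get("manual_assessments", []))


print(effective_label(example_result))
print(is_disputed(example_result))
```

Letting the human assessment win is only one possible policy. A tool could just as well show both side by side and leave the judgment to its users.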