Wikimedia Technical Conference/2018/Session notes/Integrating machine learning into our products

Theme: Defining our products, users, and use cases
Type: Session
Facilitation Exercise(s): Small group discussions

Leader(s): Aaron Halfaker
Facilitator: Kate
Scribe: Michael

Description: This session looks at the use of machine learning and other types of automated assessments in the Wikimedia ecosystem. We’ll discuss what Wikimedia needs to do in order to embrace the challenges of operating infrastructure for machine learning. We’ll also discuss how long-term maintenance of AI services interfaces with new product development.

Questions to answer during this session

What is an ecosystem? What’s a technology ecosystem? What makes an ecosystem healthy?

We talk about our “technology ecosystem”, but does anyone really understand what an ecosystem is, how it operates, and what its constraints are? This question is important so that we can develop a common language and a common understanding of what technical ecosystem health looks like.

Session notes: https://etherpad.wikimedia.org/p/wmtc18_ml_ecosystem

Where has ML been used within the Wikimedia ecosystem? What are some successes we can be inspired by? What kinds of predictions/assessments/rankings do we want to have access to next?

Machine learning is a relatively new technology. Most people don’t understand what it is or what it can do for them. Through discussing the impacts that ML has already had, participants will gain a grasp of what ML has to offer and why it may be worth substantial investment of time and resources. Examples include simple classifiers (ORES), similarity indexes (Elasticsearch), and the merging of the two (learning to rank, LTR). Is the next step general recommender infrastructure? Image processing? Knowledge integrity? What do we need to do in the next 5 years?

Session notes: https://etherpad.wikimedia.org/p/wmtc18_ml_where

Conclusion

Machine translation
- Image classification/similarity for Commons
- Need humans to teach the algorithms & multilingual support
Interest routing -- Finding people interested in same topic - among editors, to connect new editors with experienced people who can help them
- Topic categorization
- To target readers and help people find what they are interested in
- We can use it to cluster search results
- We can do targeted outreach to authors with expertise on a subject.
Vandalism detection in underserved languages
Citation and trustworthiness
- Look across article to resolve [citation needed]
- Evaluate sources
- Identify fake news
- Cross-wiki fact alignment (is this statement supported in Wikidata)
Develop an AI portal with our models and datasets to support external researchers
Quality scoring for code contributions

What does ML cost? What kind of time and resources do we need to make ML sustainable?

ML might seem like magic, but it’s definitely not free. ORES and the Scoring Platform team are an example of what it takes to invest in ML infrastructure. Knowing what it costs to maintain an ML service can help us know how to plan our investments wisely. It can also help us avoid under-investing and thus creating weak foundations on which to build.

Session notes: https://etherpad.wikimedia.org/p/wmtc18_ml_cost

Conclusion

Consider the hidden costs of project management and outreach work. Human-labeled data is surprisingly expensive.

How do we integrate automated assessments into the wiki interface? What concerns present themselves when machines begin to encroach on subjective judgement?

Automated analysis isn’t particularly useful on its own; it’s a tool that is designed to make life easier for the wiki communities. To achieve this, the outputs of these tools need to be meaningful and need to be embedded in human-machine processes that our users will act out. How do we fit machines into current workflows or use them to enable new workflows? How do we deal with the issues that will inevitably arise from having a machine take on roles that were once purely human? These are questions we must answer in order to proceed with augmented product development.

Session notes: https://etherpad.wikimedia.org/p/wmtc18_ml_humans

Conclusion: Human verification is an important theme. Should humans always verify AI decisions? How do AI predictions affect human verification? Humans are good at quality judgments and machines can increase speed. Can we increase the fitness of AIs to bring down human labor even more? Humans should ultimately be involved. Really practical processes are important for managing our content. Our communities will want to control our AIs. Feedback is a good strategy for controlling and tracking AI behavior.