Wikimedia Technical Conference/2018/Session notes/Architecting Core: stand-alone services

Slides that were used to guide this session are available on Commons.

Goals for the session (see slide for full text)

The goal is a specific set of criteria for determining whether functionality belongs in MediaWiki (MW) or in a standalone service. In essence, the outcome of this session will be an RFC comprising the criteria, requirements, and expectations for MediaWiki functionality that is provided in the form of a standalone service.

Definition of a standalone service (see slide for full text)

For purposes of this session, a standalone service has the following properties.

  • Business logic in separate runtime from MW
  • Interacts with MW via some remote mechanism
    • API
    • Queue
    • XHR
    • ?
  • Does not directly access MediaWiki's data store
  • May utilize MW extension(s) to call an external service, provided that the business logic is in the external service and not the extension
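The working definition above can be read as a simple predicate. The following is a minimal sketch, not anything produced in the session: `ServiceProfile`, its field names, and `is_standalone` are hypothetical names invented for illustration, and the mechanism list is treated as open-ended because the notes end it with "?".

```python
from dataclasses import dataclass, field


@dataclass
class ServiceProfile:
    """Hypothetical description of a piece of functionality (not a MediaWiki type)."""
    # Business logic runs in a runtime separate from MW.
    separate_runtime: bool
    # Remote mechanisms used to interact with MW, e.g. {"api", "queue", "xhr"}.
    mechanisms: set = field(default_factory=set)
    # True if the functionality reads or writes MediaWiki's data store directly.
    direct_mw_datastore_access: bool = False
    # True if an MW extension holds the business logic instead of only calling out.
    logic_in_mw_extension: bool = False


def is_standalone(profile: ServiceProfile) -> bool:
    """Apply the four properties from the session's working definition."""
    return (
        profile.separate_runtime
        and bool(profile.mechanisms)  # at least one remote mechanism
        and not profile.direct_mw_datastore_access
        and not profile.logic_in_mw_extension
    )
```

For example, `is_standalone(ServiceProfile(separate_runtime=True, mechanisms={"api"}))` is true, while the same profile with `direct_mw_datastore_access=True` is not.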

Exercise 1

Question 1: What properties make functionality a candidate for separation into a separate service?

Output of exercise 1, question 1
  • Async
  • Elevated security need
  • State context independency
  • Third-party library exists (potentially in another language or in another form that makes integration into MediaWiki difficult)
  • Excessive resource needs
  • Independently useful and/or can be replaced with something off the shelf
  • Better language or framework exists for solving the problem
  • Independent scalability concerns
  • Different ownership models/autonomy/rate of change
  • Used to triage MW or fix it
  • Need to ship quickly

Question 2: What properties disqualify functionality from separation into a separate service?

Output of exercise 1, question 2
  • Require direct MediaWiki DB access
  • Easy to do in the context of MediaWiki (using existing classes, for example), difficult to do outside of the context of MediaWiki.
  • Too small to justify separation overhead
  • Chattiness with the MediaWiki API
  • Synchronous
  • Needs extensibility by MW features/extensions
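The two lists from exercise 1 could be applied as a checklist when triaging a piece of functionality. The sketch below is one possible reading, with loud assumptions: the short property keys, the `triage` function, and its result strings are all hypothetical, and the rule that any single disqualifying property outweighs every qualifying one is an assumption, since the session did not record how the properties should be weighted.

```python
# Short keys for the properties recorded in exercise 1 (the keys are hypothetical).
CANDIDATE = {
    "async", "elevated-security", "state-independent", "third-party-library",
    "excessive-resources", "independently-useful", "better-language",
    "independent-scaling", "different-ownership", "triages-mw", "ship-quickly",
}
DISQUALIFYING = {
    "direct-db-access", "easy-inside-mw", "too-small", "chatty-with-mw-api",
    "synchronous", "extensible-by-mw",
}


def triage(properties):
    """Return (verdict, reasons) for a collection of property keys.

    Assumption: one disqualifying property is enough to keep the
    functionality in MediaWiki, regardless of qualifying properties.
    """
    props = set(properties)
    unknown = props - CANDIDATE - DISQUALIFYING
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    blockers = sorted(props & DISQUALIFYING)
    if blockers:
        return ("keep in MediaWiki", blockers)
    reasons = sorted(props & CANDIDATE)
    if reasons:
        return ("candidate for a standalone service", reasons)
    return ("no case made", [])
```

A real RFC would likely weigh these properties rather than treat them as a strict veto; the strict rule is used here only to keep the sketch short.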

Exercise 2

Question 1: What existing MediaWiki functionality is provided by standalone services?

Functionality already provided to MediaWiki by standalone services. Hearts (in the original image) denote services that are candidates for reintegration into MediaWiki.
  • Parsing (Potentially quick/large wins available through reintegration into MediaWiki.)
  • Thumbnailing
  • ORES
  • cp-jobqueue
  • PDF
  • Eventstreams
  • Map tiles
  • Recommendation
  • Search
  • MCS (Mobile content service)
  • Restbase (caching / routing)
  • Citations
  • Mathoid
  • Graph rendering
  • Translations
  • WDQS
  • Analytics
  • Routing (Potentially quick/large wins available through reintegration into MediaWiki.)
  • CDN

Question 2: What existing MediaWiki functionality could be provided by standalone services?

Grid estimating the difficulty and the scale of "the win" from extracting each piece of functionality into a standalone service
  • A/B Testing
  • Job Queue
  • Server Side Rendering
  • Maps
  • Inter-Service Discovery/Routing
  • Users and Auth
  • Echo Notification
  • URL Routing
  • l10n/i18n
  • (Some) special pages
  • Edge purger
  • Media handling & transcoding
  • URL shortening
  • Reading lists
  • Watchlists
  • Revision service
  • Parser

Exercise 3

Question 1: What technical/architecture requirements should apply to all standalone services?

  • Minimize data collection
  • OSI licensed
  • Respect GDPR and other applicable data privacy frameworks
  • Must do a thing
  • Should not be redundant with other services

Question 2: What additional requirements should apply to standalone services in Wikimedia production?

  • SLIs/SLOs
  • WMF-compatible monitoring
  • Has a privacy policy and policy practices that are compatible with WMF privacy policy
  • Uses Wikimedia deploy tooling
  • Has passed WMF Security review
  • Uses a language and toolset that have been approved by TechCom
  • Has an owner
  • Has Runbooks
  • Is licensed under an OSI-approved Open Source license
  • Has WMF-compatible structured logging
  • Swagger specs
  • Fault tolerant
  • Multi data center
  • Backups
  • Pinned/Pinnable dependencies
  • Horizontal scalability
  • Documentation
  • Trusted upstream asset chain
  • Performs sufficiently for Wikipedia use cases
  • Has users (or a plan to get users)

Question 3: What additional requirements should apply to standalone services distributed for third-party use?

Output of exercise 3, question 3 at WMTC 2018
  • Easy to install
  • Versioned (semver) - Compatible with supported MW releases (LTS)
  • Easy to upgrade and to extend
  • Public docs on install, upgrade
  • Config outside of code
  • Operationally independent of wikis
  • Open source - usable, accepts patches, etc.
  • Small footprint
  • Public security advisories
  • Support channel
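As an illustration of the "config outside of code" requirement, a service might read its settings from the environment at startup instead of hard-coding them. This is a minimal sketch under stated assumptions: the `EXAMPLE_SVC_*` variable names, the `ServiceConfig` fields, and the default values are hypothetical and do not describe any existing Wikimedia service.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical settings for an example standalone service."""
    listen_port: int
    mediawiki_api_url: str
    log_level: str


def load_config(env=None):
    """Build the config from environment variables (or any mapping, for tests)."""
    env = os.environ if env is None else env
    try:
        # No default: the operator must say which wiki the service talks to.
        api_url = env["EXAMPLE_SVC_MW_API_URL"]
    except KeyError:
        raise RuntimeError("EXAMPLE_SVC_MW_API_URL must be set") from None
    return ServiceConfig(
        listen_port=int(env.get("EXAMPLE_SVC_PORT", "8080")),
        mediawiki_api_url=api_url,
        log_level=env.get("EXAMPLE_SVC_LOG_LEVEL", "INFO"),
    )
```

Keeping the wiki endpoint in configuration also supports the "operationally independent of wikis" item: the same build can be pointed at a different wiki without a code change.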