Wikimedia Technical Conference/2018/Session notes/Architecting Core: stand-alone services
Slides that were used to guide this session are available on Commons.
Goals for the session (see slide for full text)
A specific set of criteria for determining whether functionality goes in MW or a standalone service. In essence, the outcome of this session will be an RFC, comprising the criteria, requirements, and expectations for MediaWiki functionality that is provided in the form of a standalone service.
Definition of a standalone service (see slide for full text)
For purposes of this session, a standalone service has the following properties.
- Business logic in separate runtime from MW
- Interacts with MW via some remote mechanism
  - API
  - Queue
  - XHR
  - ?
- Does not directly access MediaWiki's data store
- May utilize MW extension(s) to call an external service, provided that the business logic is in the external service and not the extension
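The properties above can be illustrated with a minimal sketch. It assumes an HTTP/JSON API as the remote mechanism; the names `count_words`, `ServiceHandler`, `call_service`, and `demo`, as well as the payload shape, are hypothetical and not taken from the session. The business logic lives only in the service, and the caller (standing in for a thin MW extension) only forwards a request.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


def count_words(text: str) -> int:
    """Hypothetical business logic; it exists only in the service runtime."""
    return len(text.split())


class ServiceHandler(BaseHTTPRequestHandler):
    """Standalone service: exposes the logic over a small JSON API."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"words": count_words(payload["text"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence the default per-request logging for this sketch.
        pass


# Opener that ignores proxy settings, so local calls stay local.
_opener = build_opener(ProxyHandler({}))


def call_service(base_url: str, text: str) -> int:
    """Thin caller (stand-in for an MW extension): no business logic here."""
    request = Request(
        base_url,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with _opener.open(request, timeout=5) as response:
        return json.load(response)["words"]


def demo() -> int:
    """Start the service on an ephemeral local port and call it once."""
    server = HTTPServer(("127.0.0.1", 0), ServiceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        return call_service(url, "stand-alone services in MediaWiki")
    finally:
        server.shutdown()
        server.server_close()
```

In this sketch the service never touches MediaWiki's data store; anything it needs would have to arrive in the request or be fetched through MW's own API.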
Exercise 1
Question 1: What properties make functionality a candidate for separation into a separate service?
- Async
- Elevated security need
- State context independency
- 3rd-party library exists (potentially in another language or in another form that makes integration into MediaWiki difficult)
- Excessive resource needs
- Independently useful and/or can be replaced with something off the shelf
- Better language or framework exists for solving the problem
- Independent scalability concerns
- Different ownership models/autonomy/rate of change
- Used to triage MW or fix it
- Need to ship quickly
Question 2: What properties disqualify functionality from separation into a separate service?
- Require direct MediaWiki DB access
- Easy to do in the context of MediaWiki (using existing classes, for example), difficult to do outside of the context of MediaWiki.
- Too small to justify separation overhead
- Chattiness with the MediaWiki API
- Synchronous
- Needs extensibility by MW features/extensions
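One way to make the two lists above usable in a review is to encode them as a checklist. The sketch below is only an illustration: the short labels paraphrase the bullets, the function name `assess` is hypothetical, and it deliberately returns the matching properties on each side rather than a verdict, since the session did not define how to weigh them.

```python
# Labels paraphrase the bullets from Exercise 1; they are not official terms.
FAVORS_SEPARATION = frozenset({
    "async",
    "elevated security need",
    "state context independent",
    "third-party library exists",
    "excessive resource needs",
    "independently useful",
    "better language or framework exists",
    "independent scalability concerns",
    "different ownership or rate of change",
    "used to triage or fix MW",
    "need to ship quickly",
})

WEIGHS_AGAINST = frozenset({
    "requires direct MediaWiki DB access",
    "easy inside MediaWiki, hard outside",
    "too small to justify overhead",
    "chatty with the MediaWiki API",
    "synchronous",
    "needs extensibility by MW extensions",
})


def assess(traits):
    """Split a feature's traits into those favoring and opposing separation.

    Raises ValueError for labels that appear in neither list, so typos
    do not silently disappear from the result.
    """
    traits = set(traits)
    unknown = traits - FAVORS_SEPARATION - WEIGHS_AGAINST
    if unknown:
        raise ValueError(f"unknown traits: {sorted(unknown)}")
    return sorted(traits & FAVORS_SEPARATION), sorted(traits & WEIGHS_AGAINST)
```

For example, `assess({"async", "chatty with the MediaWiki API"})` lists one property on each side, which signals that the case needs discussion rather than an automatic answer.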
Exercise 2
Question 1: What existing MediaWiki functionality is provided by standalone services?
- Parsing (Potentially quick/large wins available through re-integration into MediaWiki.)
- Thumbnailing
- ORES
- cp-jobqueue
- Eventstreams
- Map tiles
- Recommendation
- Search
- MCS (Mobile content service)
- Restbase (caching / routing)
- Citations
- Mathoid
- Graph rendering
- Translations
- WDQS
- Analytics
- Routing (Potentially quick/large wins available through re-integration into MediaWiki.)
- CDN
Question 2: What existing MediaWiki functionality could be provided by standalone services?
- A/B Testing
- Job Queue
- Server Side Rendering
- Maps
- Inter-Service Discovery/Routing
- Users and Auth
- Echo Notification
- URL Routing
- l10n/i18n
- (Some) special pages
- Edge purger
- Media handling & transcoding
- URL shortening
- Reading lists
- Watchlists
- Revision service
- Parser
Exercise 3
Question 1: What technical/architecture requirements should apply to all standalone services?
- Minimize data collection
- OSI licensed
- Respect GDPR and other applicable data privacy frameworks
- Must do a thing
- Should not be redundant with other services
Question 2: What additional requirements should apply to standalone services in Wikimedia production?
- SLIs/SLOs
- WMF-compatible monitoring
- Has a privacy policy and policy practices that are compatible with WMF privacy policy
- Uses Wikimedia deploy tooling
- Has passed WMF Security review
- Uses a language and toolset that have been approved by TechCom
- Has an owner
- Has Runbooks
- Is licensed under an OSI-approved Open Source license
- Has WMF compatible structured logging
- Swagger specs
- Fault tolerant
- Multi data center
- Backups
- Pinned/Pinnable dependencies
- Horizontal scalability
- Documentation
- Trusted upstream asset chain
- Performs sufficiently for Wikipedia use cases
- Has users (or a plan to get users)
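The "SLIs/SLOs" requirement can be made concrete with a small worked calculation. This is a minimal sketch assuming a request-based availability SLI (successful requests divided by total requests); the names `AvailabilityReport` and `availability_report`, and the example target of 99.9%, are illustrative assumptions and not figures from the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilityReport:
    sli: float               # measured fraction of successful requests
    slo_met: bool            # whether the SLI reaches the target
    budget_remaining: float  # fraction of error budget left; negative if overspent


def availability_report(total: int, failed: int, slo_target: float) -> AvailabilityReport:
    """Compare a measured availability SLI against an SLO target."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and total")
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")

    sli = (total - failed) / total
    # The error budget is the number of failures the SLO tolerates.
    allowed_failures = (1 - slo_target) * total
    budget_remaining = 1 - failed / allowed_failures
    return AvailabilityReport(sli, sli >= slo_target, budget_remaining)
```

With 1,000,000 requests, 400 failures, and a 99.9% target, the budget allows about 1,000 failures, so roughly 60% of the error budget remains.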
Question 3: What additional requirements should apply to standalone services distributed for 3rd party use?
- Easy to install
- Versioned (semver)
- Compatible with supported MW releases (LTS)
- Easy to upgrade and to extend
- Public docs on install, upgrade
- Config outside of code
- Operationally independent of wikis
- Open source - usable, accepts patches, etc
- Small footprint
- Public security advisories
- Support channel
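As an illustration of "config outside of code", the sketch below reads settings from environment variables with defaults, so a third-party operator can reconfigure the service without patching it. The variable names (`SERVICE_MW_API_URL`, `SERVICE_PORT`, `SERVICE_LOG_LEVEL`), the defaults, and the `ServiceConfig` type are hypothetical choices for this example only.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    mediawiki_api_url: str
    listen_port: int
    log_level: str


def load_config(env=None) -> ServiceConfig:
    """Build the configuration from a mapping, defaulting to os.environ.

    Accepting a mapping keeps the function easy to exercise without
    modifying the real process environment.
    """
    env = os.environ if env is None else env
    return ServiceConfig(
        # All variable names and defaults below are illustrative assumptions.
        mediawiki_api_url=env.get("SERVICE_MW_API_URL", "http://localhost/w/api.php"),
        listen_port=int(env.get("SERVICE_PORT", "8080")),
        log_level=env.get("SERVICE_LOG_LEVEL", "INFO").upper(),
    )
```

Calling `load_config({"SERVICE_PORT": "9000"})` overrides only the port and keeps the other defaults, which is the behavior an installer or container setup would typically rely on.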