Wikimedia Services/Roadmap

2014 / 15

The high-level roadmap from the goals page:

Quarter

Goals

Jul–Sep 2014
  • Implement first iteration of REST API front-end (restface)
    • Alpha deploy to api.wikimedia.org
    • Simple proof-of-concept implementation of the page metadata end point for Parsoid HTML views (red links etc.)
  • PDF render service deployment
  • First iteration of the Rashomon storage service (with versioned blob and queue buckets & rudimentary auth)
    • Deploy & start using it with Parsoid
    • Wrap HTML load/save in restface
  • Citation service
  • Mathoid deploy
  • HTML templating: documentation, in particular KnockOff compiler & PHP implementation
    • Iterate once we have feedback from users & think more about use for content & messages
Oct–Dec 2014
  • Security: design & implement a more intelligent security architecture / authentication solution
  • Efficient page metadata end point for Parsoid HTML page views (red links etc.)
  • Set up proper Varnish caching and purging for API (w/ help from ops)
  • Packaging / deployment: make Debian/Ubuntu packages for the front-end, PDF and Rashomon services
  • Structured API documentation: set up front-end
  • Help other teams like Mobile use page-related storage & build data extraction services
Jan–Mar 2015
  • Job queue runner using storage service queues
  • Help with Mobile API / service needs (ongoing)
  • Help with Flow API needs
  • Storage service
    • Think about solution (bucket type?) for link table scaling, in collaboration with platform & ops
    • Implement new bucket types in storage service
  • Prototype HTML content / i18n message templating solution in collaboration with Parsoid & platform
Apr–Jun 2015
  • Implement an efficient CentralNotice end point
  • Iterate on HTML templating solution
  • Multi datacenter operation (w/ ops)
  • Possibly implement link table solution in storage service
  • Cacheable Wikidata API? Echo?
  • Possibly look into HTML diffing service for HTML-only operation

Interdependencies

  • Parsoid depends on Rashomon revision storage & content API
  • VE, Flow, Mobile & platform depend on HTML content & page metadata end points
  • Lots of stakeholders on storage service (platform, features, mobile, dev community)
  • Lots of stakeholders on HTML templating (community, platform, features, mobile)
  • We depend on ops for provisioning, deployment & monitoring

Details on individual projects

REST API front-end (working title: restface)

  • Goal: support high volume with low latency
    • Varnish caching & reliable purging
    • Usually thin wrapper around back-end services; normal case: just load from storage service
      • If missing, ask other services to create data on demand & save back to storage service
  • Consistent REST API with structured API docs
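
The "load from storage, create on demand, save back" flow above can be sketched as follows. This is a minimal illustration only: `InMemoryStorage`, `FrontEnd`, `get_html` and `fake_render` are hypothetical names, and the in-memory dict stands in for the storage service rather than reflecting the actual Rashomon or restface interfaces.

```python
from typing import Callable, Dict, List, Optional


class InMemoryStorage:
    """Hypothetical stand-in for the storage service (not the real API)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._blobs.get(key)

    def put(self, key: str, value: str) -> None:
        self._blobs[key] = value


class FrontEnd:
    """Thin wrapper: serve from storage, generate on demand when missing."""

    def __init__(self, storage: InMemoryStorage,
                 generate: Callable[[str], str]) -> None:
        self.storage = storage
        self.generate = generate  # assumed back-end call, e.g. an HTML render

    def get_html(self, title: str) -> str:
        stored = self.storage.get(title)
        if stored is not None:
            return stored              # normal case: plain load from storage
        html = self.generate(title)    # missing: ask a back-end service
        self.storage.put(title, html)  # save back for subsequent requests
        return html


calls: List[str] = []


def fake_render(title: str) -> str:
    """Assumed back-end service; records each call so reuse is visible."""
    calls.append(title)
    return f"<h1>{title}</h1>"


front_end = FrontEnd(InMemoryStorage(), fake_render)
first = front_end.get_html("Main_Page")
second = front_end.get_html("Main_Page")  # served from storage, no re-render
```

In this sketch the back-end is called only on the first request; the second request is a plain storage load, which is the case the front-end would be optimized for.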

Enable move to native Parsoid HTML5 storage & page views

  • Use static Parsoid HTML5 for all page views
    • HTML5 load / save entry point for use by desktop and Mobile page views, VE, content translation and others
      • To power Mobile skin, apps
      • Improve desktop page view latency for editors (their median page load times are currently 50+% higher)
    • Page metadata entry point for rendering of red links and other bits currently implemented as server-side content transformations
  • Facilitate additional content derivative end points (e.g. Mobile: section loading, citations, section image urls)

Miscellaneous service end points

API end point design and prototyping support for other teams

  • Example: Help Flow team in the development of a REST API for use by rich front-end, mobile

Backend services

  • See RFC for background
  • Aiming to be able to use this for regular page views in Q2
    • Improved page view performance for editors (currently 50+% slower)
    • Reduce load on PHP cluster (HW cost and energy savings)
    • Enables seamless and fast switching from page view to VE, async saving
  • Support for cross-datacenter replication, compression and even load distribution across storage cluster
  • Helps to solve scaling problems in MySQL (revision table, link tables)

Generalization of storage service to support different bucket types

  • Candidate bucket types, roughly by priority: versioned blob, queue, key-value, ordered key-value, counter
  • Features like authentication, TTL
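
To make the bucket idea more concrete, here is a rough sketch of three of the candidate types (versioned blob, queue, key-value with TTL). All class and method names are assumptions made for illustration; the roadmap does not define an interface, and authentication is left out entirely. Time is passed in explicitly so that TTL behaviour is easy to follow.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class VersionedBlobBucket:
    """Keeps every revision of a blob under one key (hypothetical interface)."""

    def __init__(self) -> None:
        self._revisions: Dict[str, List[bytes]] = {}

    def put(self, key: str, blob: bytes) -> int:
        revisions = self._revisions.setdefault(key, [])
        revisions.append(blob)
        return len(revisions)  # 1-based revision number

    def get(self, key: str, revision: Optional[int] = None) -> bytes:
        revisions = self._revisions[key]
        if revision is None:
            return revisions[-1]  # latest revision by default
        if not 1 <= revision <= len(revisions):
            raise KeyError(f"{key} has no revision {revision}")
        return revisions[revision - 1]


class QueueBucket:
    """First-in, first-out queue of items (hypothetical interface)."""

    def __init__(self) -> None:
        self._items: Deque[str] = deque()

    def push(self, item: str) -> None:
        self._items.append(item)

    def pop(self) -> Optional[str]:
        return self._items.popleft() if self._items else None


class KeyValueBucket:
    """Key-value store with an optional per-entry TTL in seconds."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[str, Optional[float]]] = {}

    def put(self, key: str, value: str, now: float,
            ttl: Optional[float] = None) -> None:
        expires = None if ttl is None else now + ttl
        self._items[key] = (value, expires)

    def get(self, key: str, now: float) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and now >= expires:
            del self._items[key]  # expired: drop lazily on read
            return None
        return value


blobs = VersionedBlobBucket()
blobs.put("Main_Page", b"<p>v1</p>")
blobs.put("Main_Page", b"<p>v2</p>")

kv = KeyValueBucket()
kv.put("session", "abc", now=0.0, ttl=60.0)
```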

Update & invalidation jobs

  • Ensure that stored data is kept up to date with changes, and front-end caches are invalidated
    • Possibly look into simple HTTP job runner using queue in storage service
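
A queue-driven update and invalidation loop might look roughly like the sketch below. It is an assumption-laden illustration: `run_jobs`, `update` and `purge` are hypothetical names, plain callables stand in for the HTTP requests a real job runner would issue, and dicts stand in for the storage service and the front-end cache.

```python
from collections import deque
from typing import Callable, Deque, Dict


def run_jobs(queue: Deque[Dict[str, str]],
             update: Callable[[str], None],
             purge: Callable[[str], None]) -> int:
    """Drain the queue: refresh stored data, then invalidate cached copies."""
    processed = 0
    while queue:
        job = queue.popleft()
        update(job["title"])  # bring stored data up to date first
        purge(job["title"])   # then purge, so the next miss reloads fresh data
        processed += 1
    return processed


# Stand-ins for the storage service and the front-end cache (assumptions).
storage = {"Foo": "old html"}
cache = {"Foo": "old html"}
queue: Deque[Dict[str, str]] = deque([{"title": "Foo"}])


def update(title: str) -> None:
    storage[title] = "new html"


def purge(title: str) -> None:
    cache.pop(title, None)


processed = run_jobs(queue, update, purge)
```

Updating before purging is a deliberate ordering in this sketch: if the purge came first, a request arriving in between could repopulate the cache with stale data.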

Misc backend services

  • Maintain Math render service (Mathoid)

General service infrastructure

Structured API documentation

  • Goals:
    • Machine-readable API specs
    • Browsable documentation & sandbox
    • Auto-generated mock APIs
  • Help establish best practices in declarative API documentation using tools like Swagger
  • See this section in the content API RFC
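
As a rough illustration of how a machine-readable spec could drive an auto-generated mock API, consider the sketch below. The `SPEC` structure is only loosely inspired by Swagger and is not a valid Swagger document; the paths, the `example` field and the helper names `match_path` and `mock_request` are all assumptions made for this example.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical spec: path templates mapped to operations with example output.
SPEC: Dict[str, Any] = {
    "paths": {
        "/page/{title}/html": {
            "get": {"summary": "Stored HTML for a page",
                    "example": "<p>Mock HTML</p>"},
        },
        "/page/{title}/metadata": {
            "get": {"summary": "Page metadata such as red links",
                    "example": {"redlinks": []}},
        },
    }
}


def match_path(template: str, path: str) -> Optional[Dict[str, str]]:
    """Return extracted path parameters if path fits template, else None."""
    t_parts = template.strip("/").split("/")
    p_parts = path.strip("/").split("/")
    if len(t_parts) != len(p_parts):
        return None
    params: Dict[str, str] = {}
    for t, p in zip(t_parts, p_parts):
        if t.startswith("{") and t.endswith("}"):
            params[t[1:-1]] = p  # templated segment: capture the value
        elif t != p:
            return None          # literal segment must match exactly
    return params


def mock_request(spec: Dict[str, Any], method: str,
                 path: str) -> Tuple[int, Any]:
    """Answer a request from the spec's example data alone."""
    for template, operations in spec["paths"].items():
        if match_path(template, path) is not None and method in operations:
            return 200, operations[method]["example"]
    return 404, None


status, body = mock_request(SPEC, "get", "/page/Main_Page/html")
```

The same spec could in principle also feed the browsable documentation and automated tests, which is the main argument for keeping it declarative.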

Drive automated service testing

  • Mocking
  • Work with QA & Antoine on containerization
  • Try to leverage API specs

Evolve authentication in collaboration with platform

Deployment and packaging in collaboration with platform and ops

  • Drive packaging of services for practical third-party and internal use
  • Leverage packages as much as possible for deployment (DRY)
    • Use Puppet for configuration management

HTML content
