Library infrastructure for MediaWiki/Initial project pitch

This project has been chosen as a Wikimedia Engineering Top Priority project for FY2014-15, Q2.

What we want to accomplish

Our long-term goal is to make a MediaWiki deployment an application composed of many small, purpose-built libraries (internally and/or externally developed) whose interfaces allow individual libraries to be exchanged for others. This flexibility would allow easier tuning of major features for various use cases and easier development of large new features by composition. For example, a (far) future project might replace the current revision storage with a system that has better performance characteristics for the Wikipedia wiki farm simply by configuring an alternate implementation of the revision storage library, rather than by including two implementations and branching logic in the MediaWiki codebase.
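As a rough illustration of the revision storage example, the sketch below shows one way a configuration value could select between two implementations of a shared interface. Every name in it (RevisionStore, ArrayRevisionStore, LatestOnlyRevisionStore, createRevisionStore, the revisionStoreClass key) is hypothetical and is not part of MediaWiki; it only shows the shape of the idea.

```php
<?php
// Illustrative sketch only: every name below is hypothetical, not MediaWiki API.

/** Contract that any revision storage library would implement. */
interface RevisionStore {
	/** Store new text for a page and return the new revision number. */
	public function save( $title, $text );

	/** Return the text of the latest revision, or null if the page is unknown. */
	public function loadLatest( $title );
}

/** Keeps every revision of every page in a PHP array. */
class ArrayRevisionStore implements RevisionStore {
	private $revisions = array();

	public function save( $title, $text ) {
		$this->revisions[$title][] = $text;
		return count( $this->revisions[$title] );
	}

	public function loadLatest( $title ) {
		if ( !isset( $this->revisions[$title] ) ) {
			return null;
		}
		return end( $this->revisions[$title] );
	}
}

/** Alternate implementation: keeps only the newest text plus a counter. */
class LatestOnlyRevisionStore implements RevisionStore {
	private $latest = array();
	private $counts = array();

	public function save( $title, $text ) {
		$this->latest[$title] = $text;
		$this->counts[$title] = isset( $this->counts[$title] ) ? $this->counts[$title] + 1 : 1;
		return $this->counts[$title];
	}

	public function loadLatest( $title ) {
		return isset( $this->latest[$title] ) ? $this->latest[$title] : null;
	}
}

/** Configuration picks the implementation; callers depend only on the interface. */
function createRevisionStore( array $config ) {
	$class = $config['revisionStoreClass'];
	$store = new $class();
	if ( !( $store instanceof RevisionStore ) ) {
		throw new InvalidArgumentException( "$class does not implement RevisionStore" );
	}
	return $store;
}

// Swapping implementations is a one-line configuration change.
$store = createRevisionStore( array( 'revisionStoreClass' => 'LatestOnlyRevisionStore' ) );
$store->save( 'Main Page', 'first draft' );
$store->save( 'Main Page', 'second draft' );
echo $store->loadLatest( 'Main Page' ), "\n";
```

The calling code never names a concrete class, so no branching logic for the two storage strategies is needed outside the factory.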

Why do this?

  • Make life better for new (and experienced) developers by organizing the code into simple components that can be easily understood.
  • Reverse the inertia toward an ever-expanding monolithic core by encouraging developers (in core) to develop their work as reusable modules with clearly defined interfaces
  • Start making true unit testing of core viable by having individually-testable units
  • Provide an interim step on the way to service-oriented architecture in a way that is useful independently of that goal
  • Encourage reuse and integration with larger software ecosystem. Done correctly, this will provide a useful means of expanding our development capacity through recruitment of library authors eager to showcase their work on a top 10 website.
  • Share our awesome libraries with others and encourage contributions from them even if they aren't particularly interested in making our sites better.

Near-term goals

  • Show a path to decomposition of MediaWiki into smaller pieces.
  • Introduce the wider PHP development community to the years of battle-tested code contained in MediaWiki, packaged as reusable libraries that can be added to any PHP project.

Three-month goal

  • Establish the precedent for how libraries should be incorporated into MediaWiki by replacing existing functionality with an existing library
    • Primary candidate: Get structured logging patches finished and merged (Structured Logging RFC)
      • Uses the PSR-3 logging interface and Monolog, both installed via Composer
      • Start people converting from wfDebugLog to direct use of PSR-3 interface
      • Add configuration for WMF cluster that uses Monolog to format and ship logs to logstash without using udp2log
  • Establish the precedent for how libraries should be used by demonstrating the split of at least one library out of core
    • Candidate #1: ResourceLoader
      • This is an obvious choice because the Editing team would like it to be independent of core and it would serve as a good example of the quality of implementation and design available in the larger MediaWiki codebase.
    Preliminary investigation of ResourceLoader quickly found that it is entangled with other MediaWiki components (Message, DB abstraction, caching, various wf* methods) to an extent that makes extracting it completely in this quarter unlikely. We would like to chip away around the edges of it as time permits but cannot reasonably commit to making this a deliverable for Q2. --BDavis (WMF) (talk) 22:30, 21 October 2014 (UTC)
    • Candidate #2: Extract Profiler into a library that can be used by anyone who wants to instrument a PHP application stack
      • Use Profiler as a test case for developing a full life cycle plan for extracted libraries
      • There is a lot of code that is reusable save for its wfDebug and wfProfileIn/wfProfileOut calls and could otherwise be extracted
    Discussion within the MediaWiki Core team points to elimination of the sort of explicit profiling system that we have built in Profiler as a more noble goal. The lead candidate for a replacement would be xhprof, which is available as a PECL extension for use with PHP5 and is built into HHVM. Various members of the MediaWiki Core team are interested in working on this as a part-time/free-time project. --BDavis (WMF) (talk) 22:30, 21 October 2014 (UTC)
    • Candidate #3: CSSJanus
      • The CSSJanus project was started by Timo to support (especially with bug fixes) and promote use of the JavaScript and PHP ports of the original Google implementation, which is now abandoned. This is a great example of our community spontaneously doing the things that this project seeks to make easier and more successful. Work for the Q2 project would include removing the copy of CSSJanus from includes/libs and replacing it with the Composer-managed package. Additionally, we would document the process of doing this extraction and publicise it to the MediaWiki and PHP communities. The goal would be to produce a rough project plan and process template for additional extractions to be done by employees and volunteers.
    • Other candidates:
      • CSSMin
      • JavaScriptMinifier
      • CLDR parser
      • HashRing
      • Aaron's UUID generator
      • Zip directory reader
      • IPSet
      • PHP JSON sanitizer and pretty printer
      • HTMLFormatter
    • General considerations:
      • Where do we put the library in git/gerrit?
      • Where do we document the library?
      • Where do we track bugs?
      • How do we promote the use of the library?
  • Introduce some dependency management system to MediaWiki which can be used to configure complex objects
  • Complete infrastructure for using Composer to manage internal and external dependencies (RFC for Composer at WMF)
  • TODO: strip this down to something that we're much more confident can be completed by the end of the 2014 calendar year
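
To make the structured logging candidate above more concrete, here is a minimal sketch of the difference between a wfDebugLog-style call and a PSR-3-style call that passes a message template plus a context array. It is self-contained, so it declares SketchLoggerInterface as a simplified, hypothetical stand-in for the real Psr\Log\LoggerInterface (which also defines emergency(), alert(), critical(), warning(), notice() and debug()). JsonLinesLogger is likewise hypothetical; in the plan above, a Monolog logger would presumably play that role.

```php
<?php
// Illustrative sketch only. SketchLoggerInterface is a cut-down, hypothetical
// stand-in for Psr\Log\LoggerInterface; JsonLinesLogger is also hypothetical.

interface SketchLoggerInterface {
	public function log( $level, $message, array $context = array() );
	public function info( $message, array $context = array() );
	public function error( $message, array $context = array() );
}

/** Collects one JSON-encoded record per log call. */
class JsonLinesLogger implements SketchLoggerInterface {
	private $channel;
	public $lines = array();

	public function __construct( $channel ) {
		$this->channel = $channel;
	}

	public function log( $level, $message, array $context = array() ) {
		// PSR-3 style interpolation: replace {key} with scalar context values.
		$replace = array();
		foreach ( $context as $key => $value ) {
			if ( is_scalar( $value ) ) {
				$replace['{' . $key . '}'] = (string)$value;
			}
		}
		// The context travels with the record, so a log shipper can index it.
		$this->lines[] = json_encode( array(
			'channel' => $this->channel,
			'level' => $level,
			'message' => strtr( $message, $replace ),
			'context' => $context,
		) );
	}

	public function info( $message, array $context = array() ) {
		$this->log( 'info', $message, $context );
	}

	public function error( $message, array $context = array() ) {
		$this->log( 'error', $message, $context );
	}
}

// Legacy style, for comparison (flat string, no machine-readable fields):
//   wfDebugLog( 'upload', "Upload of $name failed: $reason" );

$logger = new JsonLinesLogger( 'upload' );
$logger->error(
	'Upload of {name} failed: {reason}',
	array( 'name' => 'Cat.jpg', 'reason' => 'too large' )
);
echo $logger->lines[0], "\n";
```

Because calling code depends only on the interface, the formatting and shipping of records (for example, to logstash) can be changed in configuration without touching the call sites.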

Out of scope

Complete breakup of MediaWiki core into components

The end goal is not to eliminate MediaWiki but instead to make MediaWiki the platform for creating massive multi-user versioned data-based projects, and make it a platform that is flexible enough to extend in novel ways.

Important links