Manual:Developing libraries

These are guidelines for creating, publishing and managing libraries based on code originally developed as a part of MediaWiki or a related Wikimedia project. Some of these are PHP-specific, but others are general and apply to all languages.

Rationale

There is a growing desire to separate useful libraries from the core MediaWiki application and make them usable in projects that are not based on MediaWiki. This "librarization" of MediaWiki is thought to have several long-term advantages:

  • Make life better for new (and experienced) developers by organizing the code into simple components that can be easily understood.
  • Reverse the inertia toward an ever-expanding monolithic core by encouraging core developers to build their work as reusable modules with clearly defined interfaces.
  • Start making true unit testing of core viable by having individually testable units.
  • Provide an interim step on the way to a service-oriented architecture in a way that is useful independently of that goal.
  • Encourage reuse and integration with the larger software ecosystem. Done correctly, this will provide a useful means of expanding our development capacity through recruitment of library authors eager to showcase their work on a top 10 website.
  • Share our awesome libraries with others and encourage contributions from them even if they aren't particularly interested in making our sites better.

In order for this strategy to be successful, these libraries need to develop a life of their own, independent of MediaWiki. Therefore, it will be important for library authors to have some latitude and independence in making the library successful. The policies surrounding a library should largely be dictated by its primary maintainer, and the choices made may diverge from MediaWiki core. Note that the primary maintainer may not be the original author of the majority of the code (or even any of it), and will have latitude to make independent decisions about the library. The amount of latitude a maintainer gets is proportional to the commitment, credibility and hard work they bring to the library they maintain.

Repository hosting guidelines

The Wikimedia Foundation will likely invest in tooling that makes it easier to transfer code review from GitHub to internal review tools (Phabricator) in the future. Eventually this should eliminate the difference between hosting the primary git repository with the Wikimedia Foundation and hosting it on GitHub. In the near term, Gerrit is the default hosting option, except in cases where an effort is being made to attract a significant portion of contributions from external developers.

Don't host under an individual user's GitHub account. It complicates code review based on pull requests for the repository owner and makes management of the repository by a shared group difficult. Note: this doesn't mean it has to be hosted under the Wikimedia account specifically; in fact, it can be more convenient to have the project under a different organizational account specific to the project, like CSSJanus.

Repository naming guidelines

Naming conventions vary somewhat based on hosting location. Follow the local conventions as far as reasonable and possible. Do not add "wikimedia-" prefixes to repository names. On Gerrit the Git repository should be named "mediawiki/libs/<name>".

Issue tracking guidelines

  • Phabricator
  • GitHub

Your library's issue tracking should match your git hosting, in order to reduce the friction of matching commits and pull requests to issues and vice versa. Thus Gerrit-hosted repos should use Wikimedia's Phabricator instance, and GitHub-hosted repos should use GitHub's built-in issue tracker.

Code review guidelines

Project code review should use the tool most closely associated with the primary git hosting. Regardless of choice of hosting platform, pre-merge code review and unit testing are strongly encouraged. Blatant self-merge behavior should be seen just as distasteful on GitHub as it generally is in Gerrit.

If primary hosting is via GitHub, changes should be proposed via pull requests rather than by pushing directly to master. In most cases the pull requests should originate from a fork of the repository associated with the user's own GitHub account. GerritHub is a Gerrit-powered code review service that can be used with GitHub-hosted repositories, for projects that want to use Gerrit but for some reason do not want to host with Wikimedia.

Code style guidelines


We encourage MediaWiki style or PHP PSR-2, but the most important things are clarity, consistency, and best likelihood of adoption by the library's developer community.

When creating a new repository, you MUST choose a coding style standard and enforce it with CodeSniffer. Also point to the style guide in the project's README file.

For the MediaWiki coding style, use the mediawiki/mediawiki-codesniffer package.

Be sure to use the latest version of mediawiki-codesniffer (check packagist.org). Don't use a wildcard version; upgrades must be done explicitly so that a new release cannot put the master branch into a non-passing state.
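
One way to guard against an accidental wildcard is a small check in a pre-merge job. The following is only a minimal sketch and not part of Composer or mediawiki-codesniffer: the function name is_codesniffer_pinned is hypothetical, it assumes the package name and its constraint sit on a single line of composer.json, and it treats only a plain dotted number as an exact version.

```shell
# Hypothetical helper (not part of Composer or mediawiki-codesniffer).
# Succeeds only when the given composer.json pins
# mediawiki/mediawiki-codesniffer to an exact dotted version, so that
# constraints such as "*", "1.*", "^1.2" or "~1.2" are rejected.
# Assumes the package name and its constraint are on the same line.
is_codesniffer_pinned() {
    grep -Eq '"mediawiki/mediawiki-codesniffer"[[:space:]]*:[[:space:]]*"[0-9]+(\.[0-9]+)*"' "$1"
}

# Example use, run from the repository root:
#   is_codesniffer_pinned composer.json \
#       || echo "Pin mediawiki-codesniffer to an exact version" >&2
```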

For the PSR-2 coding style, use PHP CodeSniffer.

Automated testing guidelines

Both pre-merge and post-merge testing should be used. The testing should include basic lint, unit tests and coding style checks.

Gerrit

Projects hosted in Wikimedia Gerrit should use Jenkins. Use the composer test entry point. See Continuous integration/Entry points#PHP for details.

Once created, be sure to also enable the post-merge publisher for automatically generating documentation and code coverage (e.g. for IPSet, https://doc.wikimedia.org/IPSet/ and https://doc.wikimedia.org/cover/IPSet/). These jobs can be enabled for your project in the integration/config repository (example commit: Gerrit change 227615). Alternatively, file a task in the #ci-config project.

Packagist guidelines

PHP libraries should be published on packagist.org. When adding new packages:

Instructions if you have the required permissions (you are an owner in the Wikimedia GitHub organization and have access to the mediawiki and wikimedia Packagist accounts):

  1. Use the github.com mirror as the git URL; this allows Composer to download zipballs, which can be cached.
  2. Submit the repo to Packagist.org:
    • Log in to Packagist.org with the wikimedia account and submit the GitHub URL.
  3. Ensure both the "mediawiki" and "wikimedia" accounts are maintainers of the package.

The following people have access to the different accounts in case a package needs to be updated for any reason:

Packages distributed via Composer and Packagist should also include a .gitattributes configuration file in their git repository. It excludes files that are not needed at runtime, such as tests and examples, from the generated package. The wikimedia/RelPath library is a good example:

/.* export-ignore
/Doxyfile export-ignore
/composer.json export-ignore
/phpunit.xml.dist export-ignore
/build/ export-ignore
/tests/ export-ignore

Note that composer.json should be excluded, which is a bit counterintuitive. The actual composer.json matching a given release of your library will be generated by Packagist and sent outside of the actual zip package.
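
To see what such a file does before tagging a release, you can list what git archive would export, since export-ignore is the attribute that git archive honours. The following is a minimal sketch that assumes git and tar are installed; the function name list_exported_files is made up for this example. git archive reads attributes from the committed tree by default, so the .gitattributes file has to be committed before the listing reflects it.

```shell
# Hypothetical helper: print the paths that `git archive` would export
# for the given commit (default: HEAD) of the repository in directory $1.
# Paths marked export-ignore in the committed .gitattributes are omitted.
list_exported_files() {
    git -C "$1" archive --format=tar "${2:-HEAD}" | tar -tf -
}

# Example, run from the library's working copy after committing
# .gitattributes; prints any test files that would still be shipped:
#   list_exported_files . | grep '^tests/'
```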

License guidelines

For almost anything that gets extracted from MediaWiki, it's likely that it will need to be GPLv2 (or later). All contributors must agree to a change of license from GPLv2+ in order for anyone to change the license (other than changing to GPLv3). The license of the new library needs to remain clearly marked in the headers of the code, and the full license file (typically called "LICENSE" or "COPYING") must be carried into the new project.

For a library consisting entirely of new code any license complying with the Open Source Definition is likely to be acceptable, but the MIT, GPLv2+ and Apache License 2.0 licenses may be the easiest to adopt. Include both contributor copyright grant clauses which are important for ensuring the integrity of the project's code base. The Apache2 and MIT licenses are seen as more permissive, in that they allow derivative works to include a separate license for new contributions.

Documentation guidelines

Readme

Any library should have a README.md file that describes the project at a high level. This file should be formatted using Markdown syntax (e.g. for headers, links, and code blocks) which is commonly supported by the majority of git browsers while also being human readable.

A good README.md will include:

  • A brief description of the primary use case the library solves
  • How to install the library
  • How to use the library (prose and brief code example if possible)
  • How to contribute
    • License name (GPL-2, ...)
    • Where to submit bugs
    • Where to submit patches
    • Link to coding standard
    • How to run tests
    • Where to see automated test results

Code documentation

Add a pipeline for generating code documentation. Common choices are:

  • Doxygen (for PHP)
  • JSDuck (for JavaScript)
  • Sphinx (for Python)
  • Yard (for Ruby)

View existing libraries' documentation at https://doc.wikimedia.org for an example of what these look like. See Continuous integration/Entry points for how to configure these.

On-wiki documentation

Libraries should have a brief page on mediawiki.org documenting their purpose, history, and related links. Examples (complete list):

Bootstrapping a new library

New libraries require a lot of different common files to bootstrap the library (.gitattributes, composer.json, Doxyfile, phpunit.xml, etc.). To make this easier there are tools that will bootstrap the library for you.

Common combinations

  • Hosted in Gerrit and published to Packagist under the wikimedia namespace
  • Hosted in the Wikimedia GitHub organization and published to Packagist under the wikimedia namespace
  • Hosted in another GitHub organization and published to Packagist as something other than wikimedia or mediawiki (e.g. cssjanus).

When hosting is in Gerrit the project should run as any other "typical" Wikimedia sponsored project with Gerrit code review, Phabricator bug tracking and wiki documentation.

If hosted at GitHub the project could reasonably choose to do code review via pull requests and host the bug tracker on GitHub. This choice should be considered carefully on a project by project basis as divergence of code review and issue tracking tools from the larger Wikimedia community has some disadvantages:

  • The Bugwrangler will not be expected to monitor your project for issues.
  • Members of the MediaWiki developer community will probably file bugs against your project in Phabricator anyway.
  • Moving a bug report between your project and MediaWiki will be a more involved process.

These downsides may diminish over time if better bots can be created to integrate between the two environments.

Hosting under an independent GitHub organization makes sense for certain projects (CSSJanus, Wikidata, Semantic MediaWiki, ...) where an effort is being made to develop an independent and sustaining community for the project. It is especially reasonable in the case of a library like CSSJanus that is attempting to establish a cross-community and cross-language standard set of tools where only a portion of the tools overlap with the Wikimedia universe.

Transferring an existing GitHub repo to Wikimedia

  • File a ticket in the Phabricator Librarization project requesting transfer.
  • When contacted, transfer the repository to the responding Wikimedia GitHub administrator.
  • The administrator will move the project to Wikimedia and give you access.

Tips for extracting a library

The details of extracting a library will vary depending on the code being extracted and its current entanglement with other MediaWiki specific classes. In the best case, the code you want to extract is already contained in the includes/libs directory of mediawiki/core.git and thus completely unencumbered. It is suggested that code which is not in this state is progressively updated and refactored until that is the case.

Once you get all the code into includes/libs, things become a little more straightforward:

  1. Create a new project following the rest of the guidelines on this page.
  2. Import the code from includes/libs into your new project[1].
    • It may be possible to use git filter-branch to extract a copy of the files with commit history preserved, but that is not currently considered a prerequisite for extraction. It should be sufficient to take the current head of the files into the new project and provide documentation of the file provenance in the new project's README[2].
  3. Create a proper composer.json file that follows best practices for the project and publish to Packagist[3].
  4. Tag the repository to create a stable release.
  5. Propose a change to mediawiki/vendor.git importing the stable release of your new project[4]. See Manual:External libraries for additional details.
  6. Propose a change to composer.json in mediawiki/core.git to require the stable release of your new project[5].
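
The history-preserving variant of step 2 can be sketched with git filter-branch and its --subdirectory-filter option. This is a sketch under stated assumptions, not a required procedure: extract_subdir is a hypothetical helper name, the paths in the usage comment are placeholders, and the command rewrites history in place, so it should only ever be run in a throwaway clone of mediawiki/core.git.

```shell
# Hypothetical helper: rewrite the current branch of the clone in $1 so
# that the directory $2 becomes the repository root and only the commits
# that touched it remain. This rewrites history in place, so only use it
# on a throwaway clone.
extract_subdir() {
    (
        cd "$1" || exit 1
        # Recent git versions print a warning and pause before running
        # filter-branch; setting this variable skips that pause.
        FILTER_BRANCH_SQUELCH_WARNING=1 \
            git filter-branch --prune-empty --subdirectory-filter "$2" HEAD
    )
}

# Example with placeholder paths:
#   git clone /path/to/mediawiki-core /tmp/mylib
#   extract_subdir /tmp/mylib includes/libs/MyLib
```

After the rewrite, the files from the chosen directory sit at the top level of the clone, which can then be pushed to the new library's repository.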

It may also be necessary to introduce shim classes in mediawiki/core.git to provide a backwards-compatible bridge between your extracted library and the existing MediaWiki code base. The CDB library did this to provide backwards-compatible class names which did not require the use of the new Cdb\ namespace[6].

Releasing a new version of a library

  • When determining the next version number, stick to semantic versioning
  • Make sure the code is ready for release, e.g. update the readme and changelog files (example).
  • GPG sign the tag by adding -s to the git tag command:
    git tag -s v2.1.0 -m "Signed v2.1.0 release"
    
    You may also want to provide the changelog in the annotated tag notes.
  • git push --tags to submit the tag.
  • Depending on the language of the library, you may at this point need to publish the new version to a registry, e.g. with npm publish or python3 -m twine upload dist/*. Note that this is not necessary with PHP, as Packagist is configured to listen for new git tags and auto-publish.

  • Update any on-wiki documentation.
  • Find all the components using the library, via LibUp: https://libraryupgrader2.wmcloud.org/library?branch=main
  • If needed, update the library version in mediawiki/vendor and the relevant production components using the library, and ensure tests pass and code works as before.
  • Preferably update the version in non-production components.
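
For the version-number step, semantic versioning resets the lower components when a higher one is bumped: a major bump sets minor and patch to zero, and a minor bump sets patch to zero. The following is a minimal sketch of that arithmetic; next_version is a hypothetical helper, it assumes a plain MAJOR.MINOR.PATCH version with an optional leading "v", and it ignores pre-release and build suffixes.

```shell
# Hypothetical helper: print the next version after $1 for a bump of
# kind $2 (major, minor or patch). Accepts an optional leading "v".
# The body runs in a subshell so its variables do not leak.
next_version() (
    version=${1#v}
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    patch=${rest#*.}
    case "$2" in
        major) echo "$((major + 1)).0.0" ;;
        minor) echo "$major.$((minor + 1)).0" ;;
        patch) echo "$major.$minor.$((patch + 1))" ;;
        *) echo "usage: next_version VERSION major|minor|patch" >&2; exit 1 ;;
    esac
)

# Example: a backwards-compatible feature release after v2.0.3
#   next_version v2.0.3 minor    # prints 2.1.0
```

The result can be passed straight to the signed-tag command from the list above, e.g. git tag -s "v$(next_version v2.0.3 minor)".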


After the first release

Once the repository is bootstrapped and the initial release has been made, here are a few next steps to consider:

  • Create a page for the library here on mediawiki.org. See At-ease for a good example. This page should point to:
    • Source code (e.g. git.wikimedia.org, or github.com).
    • Published package (e.g. on Packagist.org or npmjs.org).
    • Issue tracker (e.g. Phabricator workboard, or GitHub Issues).
    • API Documentation (e.g. doc.wikimedia.org).
  • Enter description and URL for GitHub mirror. Especially if it's only mirrored to GitHub, this is easy to forget. Enter a one-line description and enter the url to the mediawiki.org page (if it exists) or else the doc.wikimedia.org page. See https://github.com/wikimedia/cdb and https://github.com/wikimedia/oojs for examples.

RFC information

Request for comment (RFC)
Component: General
Author(s): BDavis (WMF)
Document status: implemented
Accepted. Tim Starling (talk) 21:45, 14 Jan 2015 (UTC)

This originated as a request for comment, "Guidelines for extracting, publishing and managing libraries". The decision was to move this page out of the RFC space and improve it as documentation.

See also

References