Requests for comment/TitleValue

This request for comments introduces a new class named TitleValue to take over many uses of the current Title class.

Request for comment (RFC)
TitleValue
Component General
Creation date
Author(s) Daniel Kinzler
Document status implemented
The proposed TitleValue class is intended to represent the title of a wiki page and nothing more. It does not support interwiki links, permission checking, or even normalization. This is by design; see below.

Background

During the MediaWiki architecture discussion at Wikimania 2013, the merits of using value objects over active records were once more discussed. The consensus was that instead of making a fundamental decision and planning major refactoring, we will try out the idea on a part of the codebase where it appears to be beneficial. The idea is to continue the architecture discussion once we have collected some experience with the new approach.

A quick primer about value objects:

  • Methods in value objects have no side effects.
  • Value objects can easily be serialized and stored.
  • Value objects can be instantiated easily and efficiently.
  • Value objects represent the value, and operations on the value, but not operations with the value.
  • Value objects are typically, but not necessarily, immutable.
  • Value objects follow the principle "hair should not know how to cut itself". If you want to use a value in an operation, you need a service object that operates on the value.

Motivation

The old Title class is huge and has many dependencies. It relies on global state for things like namespace resolution and permission checks. It requires a database connection for caching.

This makes it hard to use Title objects in a different context, such as unit tests. That in turn makes it quite difficult to write clean unit tests (tests that do not use any global state) for MediaWiki, since many classes require Title objects as parameters.

In a more fundamental sense, the fact that Title has so many dependencies, and everything that uses a Title object inherits all of these dependencies, means that the MediaWiki codebase as a whole has highly "tangled" dependencies, and it is very hard to use individual classes separately.

Instead of trying to refactor and redefine the Title class, this proposal suggests introducing an alternative class that can be used instead of a Title object to represent the title of a wiki page. The implementation of the old Title class should be changed to rely on the new code where possible, but its interface and behavior should not change.

Architecture

The proposed architecture consists of four parts, initially:

  1. The TitleValue class itself. As a value object, this has no knowledge about namespaces, permissions, etc. It does not support normalization either, since that would require knowledge about the local configuration.
  2. A TitleParser service that has configuration knowledge about namespaces and normalization rules. Any class that needs to turn a string into a TitleValue should require a TitleParser service as a constructor argument (dependency injection). Should that not be possible, a default TitleParser can be obtained from a global registry.
  3. A TitleFormatter service that has configuration knowledge about namespaces and normalization rules. Any class that needs to turn a TitleValue into a string should require a TitleFormatter service as a constructor argument (dependency injection). Should that not be possible, a default TitleFormatter can be obtained from a global registry.
  4. A PageLinkRenderer service that has configuration knowledge about the base URL for links (which would replace $wgArticlePath) and access to a TitleFormatter. Any class that needs to generate links to wiki pages should require a PageLinkRenderer service as a constructor argument (dependency injection). Should that not be possible, a default PageLinkRenderer can be obtained from a global registry.
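
The constructor injection described in the list above might look roughly like the following sketch. SimpleTitleParser and RedirectReader are hypothetical names invented for this illustration, the namespace map is an assumed stand-in for the local configuration, and TitleValue and TitleParser are cut down to the minimum needed to run the example.

```php
<?php
// Sketch only. SimpleTitleParser and RedirectReader are hypothetical names;
// TitleValue and TitleParser are reduced to the minimum needed here.

class MalformedTitleException extends Exception {
}

class TitleValue {
	protected $namespace;
	protected $dbkey;

	public function __construct( $namespace, $dbkey ) {
		$this->namespace = $namespace;
		$this->dbkey = $dbkey;
	}

	public function getNamespace() {
		return $this->namespace;
	}

	public function getDBkey() {
		return $this->dbkey;
	}
}

interface TitleParser {
	public function parseTitle( $text, $defaultNamespace );
}

// Toy parser: the namespace map stands in for the "configuration knowledge".
class SimpleTitleParser implements TitleParser {
	private $namespaceIds;

	public function __construct( array $namespaceIds ) {
		$this->namespaceIds = $namespaceIds;
	}

	public function parseTitle( $text, $defaultNamespace ) {
		$text = trim( $text );
		if ( $text === '' ) {
			throw new MalformedTitleException( 'Empty title' );
		}
		$namespace = $defaultNamespace;
		$parts = explode( ':', $text, 2 );
		// Only treat the prefix as a namespace if the configuration knows it.
		if ( count( $parts ) === 2 && isset( $this->namespaceIds[$parts[0]] ) ) {
			$namespace = $this->namespaceIds[$parts[0]];
			$text = $parts[1];
		}
		return new TitleValue( $namespace, str_replace( ' ', '_', $text ) );
	}
}

// A client that turns strings into TitleValue objects asks for the parser
// in its constructor instead of reaching for global state.
class RedirectReader {
	private $titleParser;

	public function __construct( TitleParser $titleParser ) {
		$this->titleParser = $titleParser;
	}

	public function getTarget( $redirectText ) {
		return $this->titleParser->parseTitle( $redirectText, 0 );
	}
}

$reader = new RedirectReader( new SimpleTitleParser( array( 'Talk' => 1 ) ) );
$target = $reader->getTarget( 'Talk:Main Page' );
echo $target->getNamespace(), ' ', $target->getDBkey(), "\n";
```

Because RedirectReader depends only on the TitleParser interface, a test could hand it a parser with a fixed namespace map, as done here, without touching any wiki configuration.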

That is the basic design. It can be extended and elaborated in several ways, for example by defining:

  • a WikiLink class with subclasses for internal links, interwiki links, and external links.
  • a UserPermissions service that can check a user's permissions with respect to a TitleValue.
  • PageStore and RevisionStore services for looking up whether a title exists, loading the latest revision, etc.
  • ...

Implementation

Below are interfaces/stubs for the proposed classes:

	class TitleValue {

		protected $namespace;
		protected $dbkey;
		protected $fragment;

		public function __construct( $namespace, $dbkey, $fragment = '' ) { /* ... */ }

		/**
		* @return int
		*/
		public function getNamespace() { /* ... */ }

		/**
		* @return string
		*/
		public function getFragment() { /* ... */ }

		/**
		* Returns the title's DB key, as supplied to the constructor,
		* without namespace prefix or fragment.
		*
		* @return string
		*/
		public function getDBkey() { /* ... */ }

		/**
		* Returns the title in text form,
		* without namespace prefix or fragment.
		*
		* This is computed from the DB key by replacing any underscores with spaces.
		*
		* @note: To get a title string that includes the namespace and/or fragment,
		*        use a TitleFormatter.
		*
		* @return string
		*/
		public function getText() { /* ... */ }

		/**
		* Creates a new TitleValue for a different fragment of the same page.
		*
		* @param string $fragment The fragment name, or "" for the entire page.
		*
		* @return TitleValue
		*/
		public function createFragmentTitle( $fragment ) { /* ... */ }
	}
	/**
	* Service object for parsing and normalizing page titles
	*/
	interface TitleParser {
	
		/**
		* Parses the given text and constructs a TitleValue. Normalization
		* is applied according to the rules appropriate for the form specified by $form.
		*
		* @note this only parses local page links, interwiki-prefixes etc. are not considered!
		*
		* @param string $text the text to parse
		* @param int $defaultNamespace namespace to assume per default (usually NS_MAIN)
		*
		* @throws MalformedTitleException If the text is not a valid representation of a page title.
		* @return TitleValue
		*/
		public function parseTitle( $text, $defaultNamespace );
	}
	/**
	* A title formatter service for MediaWiki.
	*/
	interface TitleFormatter {

		/**
		* Returns the title formatted for display.
		* Per default, this includes the namespace but not the fragment.
		*
		* @note Normalization is applied if $title is not in TitleValue::TITLE_FORM.
		*
		* @param int|bool $namespace The namespace ID (or false, if the namespace should be ignored)
		* @param string $text The page title
		* @param string $fragment The fragment name (may be empty).
		*
		* @return string
		*/
		public function formatTitle( $namespace, $text, $fragment = '' );

		/**
		* Returns the title text formatted for display, without namespace or fragment.
		*
		* @note: Only minimal normalization is applied. Consider using TitleValue::getText() directly.
		*
		* @param TitleValue $title the title to format
		*
		* @return string
		*/
		public function getText( TitleValue $title );

		/**
		* Returns the title formatted for display, including the namespace name.
		*
		* @param TitleValue $title the title to format
		*
		* @return string
		*/
		public function getPrefixedText( TitleValue $title );

		/**
		* Returns the title formatted for display, with namespace and fragment.
		*
		* @param TitleValue $title the title to format
		*
		* @return string
		*/
		public function getFullText( TitleValue $title );

		/**
		* Returns the name of the namespace for the given title.
		*
		* @note This must take into account gender sensitive namespace names.
		* @todo Move this to a separate interface
		*
		* @param int $namespace
		* @param string $text
		*
		* @throws InvalidArgumentException
		* @return string
		*/
		public function getNamespaceName( $namespace, $text );
	}
	/**
	* Represents a link rendering service for %MediaWiki.
	*/
	interface PageLinkRenderer {

		/**
		* Returns the URL for the given page.
		*
		* @todo expand this to cover the functionality of Linker::linkUrl
		*
		* @param TitleValue $page The link's target
		* @param array $params any additional URL parameters.
		*
		* @return string
		*/
		public function getPageUrl( TitleValue $page, $params = array() );

		/**
		* Returns an HTML link to the given page, using the given surface text.
		*
		* @todo expand this to cover the functionality of Linker::link
		*
		* @param TitleValue $page The link's target
		* @param string $text The link's surface text (will be derived from $page if not given).
		*
		* @return string
		*/
		public function renderHtmlLink( TitleValue $page, $text = null );

		/**
		* Returns a wikitext link to the given page, using the given surface text.
		*
		* @param TitleValue $page The link's target
		* @param string $text The link's surface text (will be derived from $page if not given).
		*
		* @return string
		*/
		public function renderWikitextLink( TitleValue $page, $text = null );

	}
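
To make the TitleValue stub above more concrete, here is one possible way to fill it in. This is a sketch, not the proposed implementation: the type checks in the constructor are an assumption, while the behavior of getText() and createFragmentTitle() follows the doc comments in the stub.

```php
<?php
// Sketch only: one possible way to fill in the TitleValue stub.
// The type checks in the constructor are an assumption, not part of the proposal.

class TitleValue {
	protected $namespace;
	protected $dbkey;
	protected $fragment;

	public function __construct( $namespace, $dbkey, $fragment = '' ) {
		if ( !is_int( $namespace ) || !is_string( $dbkey ) || !is_string( $fragment ) ) {
			throw new InvalidArgumentException( 'Bad argument type' );
		}
		// No normalization happens here; the caller supplies the DB key form.
		$this->namespace = $namespace;
		$this->dbkey = $dbkey;
		$this->fragment = $fragment;
	}

	public function getNamespace() {
		return $this->namespace;
	}

	public function getFragment() {
		return $this->fragment;
	}

	public function getDBkey() {
		return $this->dbkey;
	}

	public function getText() {
		// Computed from the DB key by replacing underscores with spaces.
		return str_replace( '_', ' ', $this->dbkey );
	}

	public function createFragmentTitle( $fragment ) {
		// Return a new object rather than changing this one.
		return new TitleValue( $this->namespace, $this->dbkey, $fragment );
	}
}

$page = new TitleValue( 0, 'Main_Page' );
$section = $page->createFragmentTitle( 'History' );
echo $section->getText(), '#', $section->getFragment(), "\n";
```

Note that createFragmentTitle() leaves the original object untouched, so a TitleValue can be shared freely between callers.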

Obtaining Service Instances

Objects that need one of the services defined above, such as a SpecialPage, should ideally obtain an instance of that service by requiring it (or a builder or factory for it) as a constructor argument. Should this not be possible (as is the case for SpecialPage objects), the service object can be created from global state or fetched from a global registry (see below). This should be done either in the object's constructor or in a getter that performs lazy initialization.

In addition, the "client" object (in our example, the SpecialPage) should provide a setter for the service, so it can be overwritten for testing even if injection as a constructor argument is not possible.
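
A minimal sketch of this getter-plus-setter pattern follows. SpecialExample and DefaultTitleFormatter are hypothetical names, the TitleFormatter interface is reduced to a single method, and ServiceRegistry is a stub that only knows about one service.

```php
<?php
// Sketch only. SpecialExample and DefaultTitleFormatter are hypothetical, and
// ServiceRegistry is reduced to a stub with a single service.

interface TitleFormatter {
	public function formatTitle( $namespace, $text, $fragment = '' );
}

class DefaultTitleFormatter implements TitleFormatter {
	public function formatTitle( $namespace, $text, $fragment = '' ) {
		// Toy behaviour: ignore the namespace, append the fragment if present.
		return $fragment === '' ? $text : "$text#$fragment";
	}
}

class ServiceRegistry {
	private static $instance = null;

	public static function getRegistry() {
		if ( self::$instance === null ) {
			self::$instance = new self();
		}
		return self::$instance;
	}

	public function getTitleFormatter() {
		return new DefaultTitleFormatter();
	}
}

// Stands in for a class whose constructor calls we do not control.
class SpecialExample {
	private $titleFormatter = null;

	// Setter: lets tests replace the service.
	public function setTitleFormatter( TitleFormatter $formatter ) {
		$this->titleFormatter = $formatter;
	}

	// Getter with lazy initialization: falls back to the global registry.
	protected function getTitleFormatter() {
		if ( $this->titleFormatter === null ) {
			$this->titleFormatter = ServiceRegistry::getRegistry()->getTitleFormatter();
		}
		return $this->titleFormatter;
	}

	public function getHeading( $text, $fragment = '' ) {
		return $this->getTitleFormatter()->formatTitle( false, $text, $fragment );
	}
}

$page = new SpecialExample();
echo $page->getHeading( 'Main Page', 'History' ), "\n";
```

The registry is only consulted if no formatter was set beforehand, so a test that calls setTitleFormatter() first never touches global state.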


Global Registry

For getting access to the TitleParser, TitleFormatter, and PageLinkRenderer objects, a global registry object is proposed. This should be used only where dependency injection is not possible, such as static hook functions or where there is no control over constructor calls. In general, explicit dependency injection as a constructor parameter is preferred.

	/**
	* Global service registry. Only use in static context! Access
	* to registry objects implies a lot of dependencies, so it
	* should generally be avoided and restricted to the edges
	* of an application.
	*/
	class ServiceRegistry {
		
		public static function getRegistry() { /* ... */ }
		
		public function getTitleParser() { /* ... */ }
		
		public function getTitleFormatter() { /* ... */ }
	}

There are two major use cases for using the registry singleton:

Firstly, gaining access to the service objects in a static context, such as a hook handler function. Here, the singleton would be used to get the service objects that then get injected into an object that implements the actual logic that should be attached to the hook:

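 public static final function onSomeHook( $stuff ) {
     // Fetch the services from the global registry in the static context...
     $registry = ServiceRegistry::getRegistry();
     $someService = $registry->getSomeService();
     $anotherService = $registry->getAnotherService();

     // ...then inject them into the object that implements the actual logic.
     $handler = new MySomeHookHandler( $someService, $anotherService );
     $handler->onSomeHook( $stuff );
 }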

Secondly, support for legacy code. For instance, Title::getLocalURL() should be changed to use a TitleFormatter, but there is no good way to inject a TitleFormatter into a Title object. So, Title::getLocalURL() would use the global registry instance to get the service.

As an alternative, it would be possible to use the RequestContext class as a registry, or make the registry available from RequestContext. But that would mean that all code that uses RequestContext directly or indirectly (which is pretty much everything in MediaWiki) would then needlessly also depend on the new services. This would make the problem of entangled dependencies worse instead of improving it.

Instead, use of the registry object should be restricted to a few isolated places. The registry should not be passed around at all. Anything that needs a service should ideally ask for that service explicitly in the constructor.

Usage

In MediaWiki core, Title objects are often used where a reference to a wiki page is needed. However, because they are so heavyweight, they drag in a large number of dependencies and make testing the respective code quite hard. TitleValue could be used in places where only a reference to a wiki page is needed. For example:

  • in Revision, to represent the title of the page the revision belongs to.
  • in the Linker, specifying which page to link to.
  • in WatchItem, specifying which page to watch.
  • etc.

Each of these classes needs to perform some operation on the title that TitleValue itself does not support, like getting the DB key form, or checking whether the page exists. Service objects for performing these tasks would need to be injected. This may seem troublesome, but is actually an advantage: it means that we can control how that class checks whether a title exists, and can provide a dummy method for use in tests.
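
The testing advantage can be illustrated with a small sketch. PageExistenceChecker, LinkClassifier and FakeExistenceChecker are hypothetical names that do not appear in this proposal, the 'new' CSS class is used purely for illustration, and TitleValue is again reduced to the minimum.

```php
<?php
// Sketch only. PageExistenceChecker, LinkClassifier and FakeExistenceChecker are
// hypothetical names; TitleValue is reduced to the minimum needed here.

class TitleValue {
	protected $namespace;
	protected $dbkey;

	public function __construct( $namespace, $dbkey ) {
		$this->namespace = $namespace;
		$this->dbkey = $dbkey;
	}

	public function getNamespace() {
		return $this->namespace;
	}

	public function getDBkey() {
		return $this->dbkey;
	}
}

// The service; a production implementation would presumably query the database.
interface PageExistenceChecker {
	public function exists( TitleValue $title );
}

// The class under test gets the service injected.
class LinkClassifier {
	private $checker;

	public function __construct( PageExistenceChecker $checker ) {
		$this->checker = $checker;
	}

	public function getCssClass( TitleValue $title ) {
		return $this->checker->exists( $title ) ? '' : 'new';
	}
}

// Dummy for tests: no database, no global state.
class FakeExistenceChecker implements PageExistenceChecker {
	private $known;

	public function __construct( array $known ) {
		$this->known = $known;
	}

	public function exists( TitleValue $title ) {
		$key = $title->getNamespace() . ':' . $title->getDBkey();
		return in_array( $key, $this->known, true );
	}
}

$classifier = new LinkClassifier( new FakeExistenceChecker( array( '0:Main_Page' ) ) );
var_dump( $classifier->getCssClass( new TitleValue( 0, 'Main_Page' ) ) === '' );
var_dump( $classifier->getCssClass( new TitleValue( 0, 'No_such_page' ) ) === 'new' );
```

The test decides which pages "exist" simply by listing them, which is the kind of control over the existence check that the paragraph above describes.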