Wikimedia Security Team/Security reviews/What we are looking for

The extension security review process is meant to ensure that MediaWiki extensions used by Wikimedia don't degrade the security of Wikimedia wikis or our users.

In this document, any time it is said "A user should not be able to do X", it should be taken to mean "A user should not be allowed to do X unless their account has been given applicable rights that authorize them to do X"

Security Goals

The following is a list of security threats we aim to prevent:

A user should not be able to circumvent MediaWiki's access control mechanisms. The particular model of who is able to do what will vary significantly depending on MediaWiki's config - The goal here is that the extension does not circumvent whatever is specified in MediaWiki's configuration.
- Users should not be able to edit protected pages unless they have the applicable right
- Users should not be able to read deleted/revision deleted/oversighted content
- Users should not be able to take write actions unless they are authorized (e.g. Deleting pages)
- Users should not be able to read private wikis unless authorized
  - Note: This only applies to wikis set up where the wiki is entirely private except for a small number of whitelisted pages. Extensions don't necesarily have to respect complex read access control, as this is not used in Wikimedia, and not officially supported by MediaWiki
A user should not be able to cause or "trick" another user into taking an action without their consent.
- Generally, a user has to actually do the action, for the action to be done. A third party site should not be able to cause a user to take some action just by visiting the third party site.
- XSS
- Clickjacking
- target=_blank (in certain circumstances)
- General phising stuff
- CSRF
The feature must not allow users to gain access to the server
- Shell injection
- SQL injection (sometimes)
- Serialization vulnerabilities
- XML entity expansion (XEE)
- Insecurely configured, or outdated versions of programs on the server
- etc
The feature must not allow a user to obtain private information about another user
- SQL injection
- XSS
- Common forms of private information:
  - IP address of a logged in user
    - XSS, loading external resources (such as external images, fonts, css, etc)
  - Passwords (or hashes)
    - SQL injection, XSS
  - User preferences (Exception: Gender preference)
    - SQL injection, XSS. Feature accidentally making private information public
  - browser tracking - e.g. Don't load facebook like button
  - Deleted (and especially oversighted) content
Features should not cause an undue burden on the community due to spam or vandalism potential. In particularly, features should integrate into MediaWiki's builtin controls against bad content.
- Features should be integrated into AbuseFilter if applicable
- Normal MediaWiki captchas should apply if appropriate
- Features should be integrated into normal MediaWiki content tracking (Special:Contributions, Special:Newpages, etc)
- Must support deleting and oversighting bad content
- CheckUser integration
- Generally do not make wiki "messy" (e.g. Just because something is out of the way, doesn't mean its ok for it to be a massive spam trap)
- MediaWiki's rate limits should apply (if appropriate)
- More generally, it is highly recommended to make new content into wikipages (i.e. With a custom content model) over re-inventing the wheel, to ensure that MediaWiki's content security controls automatically apply.
Features must comply with Wikimedia's Privacy policy and data retention policy
- The security review process is not generally a privacy review process. The main point of this goal is to ensure, that if anything looks "suspicious", that legal is flagged, and that the feature goes through an additional privacy review before being deployed.
- Avoid storing data that isn't shown publicly where possible. This goes doubly for any personally identifiable information
- If IP addresses of logged in users must be stored, they need to be deleted within 90 days
- Users should be warned if their IP address with be publicly exposed (e.g. If you are making an alternative editing interface that anon users can use)
- If the extension does anything with Personally identifiable information, then it should have a separate privacy review from legal. Similarly, if anything in the extension is not "run of the mill" when it comes to privacy, it needs a separate privacy review process from legal.
Features should follow best practises (Even if they are not strictly necessary to reach a security goal)
- Follow MediaWiki coding conventions (The easier it is for everyone to understand the code, the less likely a security issue will be missed)
- i18n messages should be properly escaped
- Don't double escape things
- ">" and "<" must be escaped in attributes even if it is not required in html5 (Some of the Tidy stuff assumes they are escaped in attributes)
- The code should be fairly easy to follow and properly documented
- It is preferred that people use MediaWiki's escaping related classes as opposed to reinventing the wheel.

Checklist

Here's a (potentially non-exhaustive) list of things to check

HTML is properly escaped (No XSS and no near-miss XSS)
i18n Messages are properly escaped
No double escaping or inappropriate escaping
Write actions MUST have CSRF tokens. They should also be POST.
Clickjacking (in particular if you add a write interface to something that's not a Special page. Especially if you add a fancy js interface to the view action of a normal page)
target=_blank must set rel="noopener" if the url is for a different domain. If the URL is potentially untrusted (i.e. user controlled or we don't trust the destination for some reason) then it should set rel="noopener noreferrer". For domains we don't think are evil, just rel="noopener" is sufficient.
External resources must not be loaded (including non-user controlled). No loading fonts from the google CDN. Similarly for CSS and images. An exception may be granted if the user consents, but we would really prefer if this never happened. If you really need to load something from somewhere else on the internet, it would be best to proxy it through our servers to protect user privacy
Under no circumstances can javascript be loaded from an external domain. This includes things like loading jquery from a CDN, but also things like
Code dependencies generally must also be checked (possible exception if they are by "well-known" organizations and not security critical). Dependencies must be deployed in a way where we have an audit trail of which dependency was deployed, and that we know it was the right one. (e.g. Running composer install every time you deploy your code isn't cool. Instead something like the mediawiki/core/vendor.git repo should be used)
Strongly prefer that people use MediaWiki classes for escaping things. Be strongly suspicious of people re-inventing the wheel in security critical code paths
Any time you shell out, it should be properly escaped
- When shelling out, should be sandboxed with firejail if possible
Check for issues with temporary files if used
XML must have external entities disabled. Also be careful with recursive entities. See LINK. If possible, prefer JSON over XML
If using DomDocument to post process XML, check for weird libxml behaviour on html comments in href,src,action and name attributes - phab:T190138
Using YAML discouraged. Use JSON instead if possible. If using YAML, ensure you are loading using the "safe" loading mechanism.
If you are loading JSON to client via ResourceLoader (e.g. mw.config), and if user can control key names of objects, be careful with double underscore keys, particularly if they can set obj.__proto__ (See for example phab:T134719).
SQL must be properly escaped. Strongly prefer using MW IDatabase wrappers over hand-escaping.
New code should avoid serialize() if at all possible (use JSON instead). Check for unserialize() vulnerabilities
Do a quick check to see if it appears to comply with Privacy and data rention policies. If it doesn't or anything suspicious, flag it for legal to check. (Security review is not primarily responsible for checking this, but if we see something suspicious ensure that the feature goes through a privacy review before deployment)
- In particular, check if feature retains IP addresses of user for more than 90 days
- Ensure that user is warned their IP will be public when editing if they are anon
- If the feature involves Personally identifiable info in any way, ensure that feature gets a separate privacy review
- Its strongly preferred that features do not store information privately unless absolutely necessary
Ensure that feature integrates with MediaWiki's antivandlism and antispam controls. It is strongly preferred if the feature allows users to create content, that the feature uses ContentHandlers instead of a complex special page system
If feature involves regexes, make sure any user input is run through preg_quote() (unless the user is asked to provide a full regex)
Do a quick check for scary DB queries, and if so flag for DBAs to do a database review [TODO: Need to sort out with DBAs if they actually want this? Some people suggest maybe it should be flagged for performance team instead]. Some red flags include
- looking at more than 1000 rows during a web request (especially if its a large filesort). Exception, ok in QueryPage class if the query page is cached
- Doing large number of writes in maintenance script with waiting for slaves
- Doing large number of writes in web view. (Should be done via job queue instead)
- Doing any writes during a GET request (should be a POST request instead or pushed to a job queue)
If feature does scary performance things (Like totally disabling cache) flag to performance team to do a performance review [The security review is not meant to generally check performance, just if we see something that looks overly suspicious, flag it for the other team]
Check to make sure user cannot trigger an overly large amount of resource usage (For example, a single edit should not cause a million db rows to be written as that's a DOS. Similarly for file system usage).