Requests for comment/Reducing image quality for mobile

core patch 119661 merged

Request for comment (RFC): Reducing image quality for mobile
Component: General
Creation date:
Author(s): Yurik, MaxSem
Document status: implemented (see Phabricator)

Rationale

Many mobile devices with low bandwidth and slow processors could benefit from receiving JPEG images at reduced quality. Such images transmit much faster and may require less memory to process. This proposal adds a quality-control feature to thumbnail generation, which extensions can use whenever the benefit of lower bandwidth outweighs the need for high quality.

  • Quality vs pixel size: image byte size can be reduced by lowering either the quality or the pixel size of the picture. Unlike quality, pixel size changes the overall layout of the page and should depend exclusively on the size and DPI of the screen, whereas lowering quality does not affect page layout yet brings a very significant byte-size reduction. Thus, code with domain knowledge about screen sizes and layouts (e.g. the Mobile extension) could choose to reduce pixel size, whereas code that knows about the network infrastructure (e.g. Zero) could additionally reduce quality (see the example after this list).
  • Why JPEG: unlike PNG, JPEG can easily be compressed further without changing the thumbnail dimensions specified by page authors.
  • Target devices: this RFC mainly focuses on the mobile market, as it tends to have tighter bandwidth constraints, but since this is a generic change in core, it could also be used for desktop optimization.
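
As a rough illustration of the two knobs, MediaWiki's scalers are commonly backed by ImageMagick; commands of roughly the following shape show the difference (the exact production invocation differs, and the file names here are made up):

  convert Example.jpg -resize 300x Example-300px.jpg
  convert Example.jpg -resize 300x -quality 30 Example-300px-q30.jpg

The first command changes only the pixel size (and therefore the layout); the second additionally lowers the JPEG quality, keeping the same 300px layout while producing a much smaller file.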

Possible approaches

All small images - We could reduce the quality of all smaller thumbnail images (e.g. whenever height or width is less than 300px).

PROs - same number of thumbnail images, smaller storage requirements, very small code change, no change to URL schemas, no need for JavaScript DOM manipulation.
CONs - everyone is affected; people with good connections and good screens get a worse experience.

Varnish magic - Varnish could decide to serve different backend images for the same image URL based on whether the request comes from a Zero network, rewriting the thumbnail URL to insert the quality parameter (e.g. "-q30").

PROs - no JavaScript DOM manipulation, no quality degradation for non-mobile users.
CONs - complex Varnish heuristics; Varnish must know about the thumbnail URL structure; Varnish might not have enough data to make informed decisions about network quality; JavaScript might need to send a cookie indicating that the user always wants high- or low-quality images (or change image URLs with an extra URL parameter).

JavaScript magic - The Zero extension does DOM post-processing after parsing, replacing <img> source URLs with ones carrying the "-q30" parameter. JavaScript may later decide to change the URLs back to regular high quality.

PROs - Most flexible approach, does not require any Varnish infrastructure changes
CONs - JavaScript DOM manipulations might conflict if more than one extension tries to do them.
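
To make the last two approaches concrete, here is a minimal JavaScript sketch of the client-side URL rewrite. The "-q30" token and the thumbnail URL shape follow the examples in this document; the regexes and function name are illustrative, not the actual extension code:

  // Rewrite JPEG thumbnail URLs to request reduced-quality variants,
  // e.g. .../image.jpg/100px-image.jpg → .../image.jpg/100px-q30-image.jpg
  function degradeThumbnails( quality ) {
      var imgs = document.getElementsByTagName( 'img' );
      for ( var i = 0; i < imgs.length; i++ ) {
          var src = imgs[ i ].getAttribute( 'src' );
          // Only touch JPEG thumbnails under a /thumb/ path
          if ( src && /\/thumb\/.*\/\d+px-[^\/]+\.jpe?g$/i.test( src ) ) {
              imgs[ i ].setAttribute( 'src',
                  src.replace( /\/(\d+px-)/, '/$1q' + quality + '-' ) );
          }
      }
  }
  degradeThumbnails( 30 );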

Proposal - Core

Extend image URL syntax

This part has been implemented as patch 119661 and needs review

The production backend generates the requested image if the file is missing (404). The URL is parsed with a regex to extract the desired image width. To pass the quality-reduction parameter, we add an extra value "-qNNN". The value is optional, thus keeping the existing image cache intact.

//upload.wikimedia.org/wikipedia/commons/thumb/3/33/image.jpg/100px-q30-image.jpg

This exact approach has already been used for SVG's "language" parameter and for DjVu's "page" parameter.
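
For illustration, the parsing step might look like this (a JavaScript sketch of the regex logic; the actual patch is PHP and its exact pattern may differ):

  // Parse a thumbnail name such as '100px-q30-image.jpg' into its parts.
  // The 'qNNN-' token is optional, so existing cached names still match.
  function parseThumbName( name ) {
      var m = /^(\d+)px-(?:q(\d+)-)?(.+)$/.exec( name );
      if ( !m ) {
          return null;
      }
      return {
          width: parseInt( m[ 1 ], 10 ),
          quality: m[ 2 ] ? parseInt( m[ 2 ], 10 ) : null, // null → default quality
          file: m[ 3 ]
      };
  }
  // parseThumbName( '100px-q30-image.jpg' ) → { width: 100, quality: 30, file: 'image.jpg' }
  // parseThumbName( '100px-image.jpg' )     → { width: 100, quality: null, file: 'image.jpg' }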

Quality parameter in Wiki markup

There have been some objections to this section, so it might need to be either reworked or removed

Add a "quality" parameter to the image link wiki markup to specify desired quality reduction.

[[file:image.jpg|100px|q=30]]

The above would render image.jpg with image quality set to 30% and width scaled to 100px. Template authors could use this parameter to substantially reduce thumbnail file sizes.

Technical details

There are currently several workflows for image link parsing, assuming the above wiki markup and generated HTML <img src=".../image.jpg/100px-q30-image.jpg" />:

404 enabled
markup → HTML → browser requests image → the image is missing (404) and is rendered on the fly, with parameters taken from the URL
404 disabled
markup → HTML & image file is created based on markup params → browser requests existing image
remote repo
markup → server calls remote repo via API → (TBD - still investigating)

Mobile would not want to alter the parser's output, as that would fragment the parser cache by varying it on user-agent type. For mobile or Zero to change quality, they can rewrite image URLs via JavaScript or DOM post-processing, which only works on 404-enabled sites (WMF sites), but not on simple installations where the 404 is not intercepted. I haven't figured out a simple way to solve this, nor am I certain that it needs solving.

Proposal - Zero

Most Zero network users operate older devices, frequently on slower networks. For Zero users, we propose a server-side DOM rewrite (already being done for external links and some images). The rewrite would remove the srcset img attribute and change the image src from the default .../image.jpg  ⇒  .../image.jpg/q30-image.jpg. All non-JS devices would keep using low-quality images, while smartphones would use JavaScript to convert q30 images back to the default based on locally set user preferences, carrier settings and network quality (TBD). This JavaScript should run before the HTML is fully loaded, so as to prevent downloading both the low- and high-quality images.
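
A sketch of that client-side revert (illustrative only; userWantsHighQuality is a hypothetical flag standing in for the preference/carrier/network checks, which are TBD):

  // Restore full-quality thumbnails on capable devices by stripping the
  // server-inserted quality token, e.g. .../image.jpg/q30-image.jpg → .../image.jpg
  // Must run before the browser starts fetching the low-quality images,
  // otherwise both variants get downloaded.
  function restoreFullQuality() {
      var imgs = document.getElementsByTagName( 'img' );
      for ( var i = 0; i < imgs.length; i++ ) {
          var src = imgs[ i ].getAttribute( 'src' );
          if ( src && /\/q\d+-[^\/]+$/.test( src ) ) {
              imgs[ i ].setAttribute( 'src', src.replace( /\/q\d+-[^\/]+$/, '' ) );
          }
      }
  }
  if ( userWantsHighQuality ) { // hypothetical: from preferences, carrier settings, network quality
      restoreFullQuality();
  }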

Usage Assessment

If Zero replaces all images on all pages except the File: namespace, we are theoretically looking at creating a copy of every image used. Yet keep in mind that most images used on wiki pages have been significantly scaled down from the original (images tend to be used in smaller frames), and reducing their quality to 30 should shrink them further - by my random sampling of several wiki images, to roughly 30-40% of the original size. These thumbnails tend to be 5-10 KB each, so even if every file on Commons (20.5 million) were actually converted, the absolute maximum is about 20.5 million × 5-10 KB ≈ 100-200 GB.