Talk:Requests for comment/Standardized thumbnails sizes

YuviPanda (talkcontribs)

I assume this will still be compatible with our current way of specifying thumbnail sizes, except that it will return the closest match? That is, as a client, I do not need to have any knowledge of 'allowed' sizes. If I request a thumbnail for 310px, I should automatically get a 320px thumbnail and some way of knowing it was rounded up. This should happen at the thumb scaler level rather than the API level, so that clients can continue building thumbnail URLs knowing just the filename, without an extra network request.

Brooke Vibber (talkcontribs)

We're suckers for backwards compatibility so that's probably what we'd end up doing.

Or, if Tim's position prevails, we'll keep making arbitrary thumbs; we just won't store them outside of cache. Either way I suspect we've got URL compatibility.

MarkTraceur (talkcontribs)

Hi there,

We've been struggling with a similar issue in the Multimedia team. If we want Extension:MultimediaViewer to scale properly, we should probably bucket our requests for thumbnails to a specified list of sizes. This may be a way to test different solutions to this problem without necessarily figuring out a canonical solution in core.

Basically the idea would be to try bucketing the thumbnail requests to a set of 10-20 predetermined sizes, then either requesting the nearest one above the ideal size and scaling down on the client, or requesting the nearest one below and showing the image in a container with a frame.
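
The two bucketing options described above (nearest size above, or nearest size below) can be sketched as follows. This is only an illustration: the `BUCKETS` list and the function names are assumptions, since the thread does not settle on an actual list of sizes.

```python
from bisect import bisect_left, bisect_right

# Hypothetical bucket widths (sorted ascending); not an agreed list.
BUCKETS = [320, 640, 800, 1024, 1280, 1920]

def bucket_above(width, buckets=BUCKETS):
    """Smallest bucket >= width, or None if width exceeds every bucket.

    The client would scale this thumbnail down to the ideal size.
    """
    i = bisect_left(buckets, width)
    return buckets[i] if i < len(buckets) else None

def bucket_below(width, buckets=BUCKETS):
    """Largest bucket <= width, or None if width is below every bucket.

    The client would show this thumbnail inside a framed container.
    """
    i = bisect_right(buckets, width)
    return buckets[i - 1] if i > 0 else None
```

With these assumed buckets, a 310px request maps to 320 when rounding up, and a 700px request maps to 640 when rounding down.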

If we can be of any use in testing possible solutions, we'd love suggestions as to implementation details.

Brooke Vibber (talkcontribs)

Let's start by just devising some buckets and using them explicitly in MultimediaViewer, then if we find that successful we can start building it into the backend?

The traditional default selectable File: page display sizes appear to be some classic full-screen sizes:

  • 320x240
  • 640x480
  • 800x600 (default)
  • 1024x768
  • 1280x1024 (slightly different aspect ratio)

Some things to consider:

  • do these fit default aspect ratios well?
  • do we also need to bucket display density variants?
  • what about dealing with panoramic images? (very long or tall aspect ratios, work best when requesting a larger total size in the longer dimension and allowing some kind of scroll/panning) Should we set size by the long dimension, or the short dimension ('show horizontal panoramas at 600px high, however wide they are' or 'show vertical panoramas at 800px wide, however tall they are')?

MarkTraceur (talkcontribs)

I'll probably add 1920x1080 just to make sure we're supporting everyone, and use the full-size image for anything above that.

When it comes to panoramic images, I guess we'll need to determine that based on the aspect ratio of the thumbnail on the page - AFAIK it's guaranteed to preserve the original ratio, so that should be OK, but we'll determine whether the ratio is e.g. 3 or 1/3, and if so, use the height or width as the only limiting factor, then enable scrolling.
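
A minimal sketch of that aspect-ratio check, assuming a threshold of 3 as in the example above. The function name and return values are made up for illustration.

```python
def limiting_dimension(width, height, threshold=3.0):
    """Decide which dimension should bound the thumbnail request.

    Returns "height" for horizontal panoramas (ratio >= threshold),
    "width" for vertical panoramas (ratio <= 1/threshold),
    and "both" for ordinary images.
    """
    ratio = width / height
    if ratio >= threshold:
        return "height"  # fix the height, let the image scroll sideways
    if ratio <= 1 / threshold:
        return "width"   # fix the width, let the image scroll vertically
    return "both"
```

Because the on-page thumbnail preserves the original ratio, its dimensions could be passed in directly.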

Good plan :)

Brooke Vibber (talkcontribs)

Be very, very careful about using "full-size" image. Sometimes that's a 4MP or 8MP photo, but sometimes it's a 100MB TIFF file. And of course sometimes it requires rotation for display in browser, or transformation to a JPG, PNG, etc. Generally... never EVER use the original file as anything but a download link, unless the API happens to return you the original file when you ask for a rendered thumbnail...

MarkTraceur (talkcontribs)

I mean, the only time that would happen is if the screen size were bigger than 1920x1080. And that's...unlikely.

I could add one more size that's double that, I guess. Maybe two more (one double, one quadruple) just to be safe.

Jdforrester (WMF) (talkcontribs)

Not that unlikely; I have one of those monitors at home, and yes, I do use it full-screen in 1:1 resolution. :-)

Brooke Vibber (talkcontribs)

You'll have people with non-retina screens in the 2560x1440 range running those big 27-30" monitors.

You'll also have people with retina screens where the screen density bumps you up from 1280x800 or 1440x900 CSS pixels to 2560x1600 or 2880x1800.

Common densities are going to be in the 1.0, 1.5, and 2.0 ranges, but others may appear (especially if you have a funny zoom, etc.).
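
One way to handle density variants, sketched under the assumption that odd densities are snapped up to the next common one before the pixel width is computed. The function names are hypothetical, and treating the three densities above as a fixed list is an assumption.

```python
import math
from bisect import bisect_left

# Densities named in the thread; using them as a fixed bucket list is an assumption.
DENSITIES = [1.0, 1.5, 2.0]

def snap_density(density, densities=DENSITIES):
    """Smallest listed density >= the requested one, capped at the largest."""
    i = bisect_left(densities, density)
    return densities[min(i, len(densities) - 1)]

def physical_width(css_width, density):
    """Device-pixel width to request for a CSS width at a given density."""
    return math.ceil(css_width * snap_density(density))
```

For example, 1280 CSS pixels at density 2.0 gives 2560 device pixels, matching the retina figures above. The result could then be bucketed like any other width.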

Sharihareswara (WMF) (talkcontribs)

Is this RfC still relevant, or is it superseded by other, more recent RfCs?

Quiddity (talkcontribs)
Tgr (WMF) (talkcontribs)

In short, relevant but outdated. There are conversations about generating some thumbnails on upload and maybe using them to generate the rest of the thumbnails. It's still in the research phase, I think.

Sharihareswara (WMF) (talkcontribs)

Ping :) Is this still Antoine's RfC or is it something the multimedia group has taken over? If the latter, could you change authorship in the infobox? Thanks!

Hashar (talkcontribs)

We had several mailing list discussions in 2012 and early 2013 about optimizing thumbnail rendering. This RFC is merely a summary of those discussions, intended to avoid repeating ourselves each time the topic comes up. I am not leading the RFC by any means; it would be nice for the new multimedia team to take the lead there.

Sharihareswara (WMF) (talkcontribs)

Generate statistics for most requested non-existing thumb widths

Subfader (talkcontribs)

To add more useful thumb widths to $wgUploadThumbnailRenderMap I generate a list of all thumb widths that are created because the thumb didn't exist yet.

Maybe it's useful to someone: MediaTransformOutput.php > hasFile()

	public function hasFile() {		
		// Generate statistics for most requested non-existing thumb widths
		// includes x1.5 and x2
		global $wgUploadThumbnailRenderMap;
		if( $this->path != null // not when thumb size is >= original and original is returned
		    && !$this->isError() // only if thumb is really created
		    && !in_array( $this->width, $wgUploadThumbnailRenderMap ) // not if is in job list to be created
		  ) {	
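			$file = '/path/to/thumbs_not_existing_on_request';
			// Append one width per line. FILE_APPEND avoids re-reading and rewriting
			// the whole log on every request; LOCK_EX guards against concurrent writers.
			file_put_contents( $file, $this->width . "\n", FILE_APPEND | LOCK_EX );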
		}
		
		// If TRANSFORM_LATER, $this->path will be false.
		// Note: a null path means "use the source file".
		return ( !$this->isError() && ( $this->path || $this->path === null ) );
	}

After some hours / days grab the most requested thumb sizes:

sort /path/to/thumbs_not_existing_on_request | uniq -c | sort -rn | head -n 30

Example after some hours:

99 240
60 170
6 320
...  

>> Adding 240 and 170 to $wgUploadThumbnailRenderMap might be worth it.


Snapping to an approved size without client side scaling

Mattflaschen (talkcontribs)

An option discussed briefly in person (and I believe on the Etherpad) today was to snap to an approved size without client-side scaling. In other words, if you put:

[[File:some-image.jpg|thumb|199px]]

you get something like:

<img width="200" height="244" src="some-image-200px.jpg">

The 199 is silently treated as 200 everywhere. This can definitely still affect layouts (particularly when the difference is greater), but will not cause new client-side scaling.
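
A rough sketch of that snapping step, assuming the requested width is moved to the nearest approved width and the height is recomputed from the original aspect ratio. The `APPROVED_WIDTHS` list, the tie-breaking rule and the function name are all assumptions made for illustration.

```python
# Hypothetical approved widths; the thread does not define the list.
APPROVED_WIDTHS = [120, 200, 250, 300]

def snapped_dimensions(requested_width, orig_width, orig_height,
                       approved=APPROVED_WIDTHS):
    """Return the (width, height) to emit in the <img> tag.

    The requested width is replaced by the nearest approved width
    (ties go to the larger one), and the height follows the original
    aspect ratio, so no client-side scaling is needed.
    """
    width = min(approved, key=lambda a: (abs(a - requested_width), -a))
    height = round(orig_height * width / orig_width)
    return width, height
```

For an original of 1000x1220 pixels, a 199px request would yield 200x244, as in the example above.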

This post was posted by Mattflaschen, but signed as Superm401.

Brooke Vibber (talkcontribs)

That probably won't work well when trying to use fixed vertical sizes matched on different widths (think of photo galleries, perhaps?), or small sizes where slight pixel counts will dramatically change the size.

Cscott (talkcontribs)

Note that nlwiki (for example) wishes to have *two* standard thumbnail sizes on the wiki, and is currently using the {{largethumb}} template to do this. This causes various problems for VE.

The solution is probably some more flexible method of applying named "standard image styles" to an image, which would include a default size parameter. Then nlwiki would use something like [[File:Foo.jpg|style=large]] instead of a bespoke template.

Reply to "{{largethumb}}"
Sharihareswara (WMF) (talkcontribs)

Gilles said:

We're not looking to standardize the sizes at the moment, we're just considering rendering specific sizes at upload time or shortly after. More specifically, the buckets that Media Viewer is using. Which currently get generated when the first person uses Media Viewer on a given image.

Gilles had also mentioned in email:

Has the swift capacity been increased yet thanks to the new hardware? If so, could we resume the discussion of "pre"generating specific thumbnail sizes at upload time?
Media Viewer could benefit greatly from this performance-wise. As seen on this graph, the launch to all wikis affected the average considerably, since users started hitting a lot of images that didn't have Media Viewer-sized thumbnails yet: http://multimedia-metrics.wmflabs.org/dashboards/mmv#overall_network_performance-graphs-tab
Thumbnailing improvements are still in the works on our end, and the idea of not using swift anymore for those is definitely on the team's radar (we've started working on more modest thumbnailing improvements at this point), but if the capacity is there, we might as well improve the average image load time for our users, even if the ever-increasing swift use for thumbnail is still a problem itself.


In reply, Mark Bergsma said:

Yes, our Swift capacity expansion has completed. Although the capacity expansion had been planned with reduced storage for thumbs in mind for the future, work that hasn’t started yet, we should have space for this available right now and it shouldn’t be problematic provided that we monitor it well. So please go ahead with this for newly uploaded images - we’ll see how it goes.
Filippo will look into whether Swift’s TTL feature is usable for us these days; hopefully we can use it to reduce storage of unused thumbs / thumb sizes until we move them out of Swift completely.

(about choosing something other than swift in the future)

Yeah. As you know there’s a fair amount of desire to change the way thumb handling & storage works, support more variants (sizes, quality), etc by various teams. I’m starting to think it would be good for us to organise a sprint on this very topic, get the multimedia team, relevant Ops people and some other interested developers from other teams and really dive into these problems.
Reply to "Ops discussion"

Needed for MobileFrontend

Kaldari (talkcontribs)

Several features in MobileFrontend are now surfacing thumbnails in association with article lists: search, nearby, related articles, watchlist, etc. In order for these features to work smoothly we would really like to have standardized, pre-rendered thumbnails.

213.61.9.75 (talkcontribs)

Concerning the goal of minimizing the CPU load spent generating an excessive number of thumbnails for a single image, the presented statistics on different thumbnail sizes aren't necessarily meaningful. If each image only ever had exactly one thumbnail generated, it would not harm anyone if the thumbnails all had varied and weird sizes. The main problem is that one image has dozens of different thumbnails, so a statistic like "1234 images have 7 thumbnails" would be more meaningful for estimating the possible CPU/storage savings on additionally generated thumbnails.
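
A statistic of the form "N images have K thumbnails" could be computed roughly like this, assuming the input is a list of (image name, thumbnail width) pairs. That input format and the function name are assumptions; the thread does not say how the thumbnail inventory would be obtained.

```python
from collections import Counter

def thumbs_per_image_histogram(thumbs):
    """Map 'number of distinct thumbnail widths' to 'number of images'.

    thumbs: iterable of (image_name, width) pairs, possibly with repeats.
    """
    widths_by_image = {}
    for name, width in thumbs:
        widths_by_image.setdefault(name, set()).add(width)
    # Count how many images have 1 thumbnail, 2 thumbnails, and so on.
    return Counter(len(widths) for widths in widths_by_image.values())
```

A result such as `{7: 1234}` would read as "1234 images have 7 thumbnails".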

Concerning possible sizes of thumbnails, if the space used by them is an issue in any way, I would suggest that there is a predefined list of possible sizes, and an algorithm which takes as input N desired thumbnail sizes, and outputs M thumbnail sizes to be generated/kept, so that at most M thumbnails would be kept.

For the list of those N thumbnail sizes, some suggested ones should be compared with the presented sizes statistics to assess one that matches the best in terms of "minimal amount of scaling needed".
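
One possible reading of such an algorithm is sketched below. It assumes each desired width is served by the smallest kept width that is at least as large, and it measures "amount of scaling" as the total number of excess pixels of width. Both the cost measure and the brute-force search are assumptions for illustration, and the search is only practical for short lists.

```python
from itertools import combinations

def choose_sizes(desired, allowed, m):
    """Pick at most m widths from `allowed` to keep.

    Every desired width must be covered by some kept width >= it.
    Among valid choices, the one with the least total excess width wins.
    Returns a tuple of kept widths, or None if no choice covers everything.
    """
    best, best_cost = None, None
    for k in range(1, m + 1):
        for kept in combinations(sorted(allowed), k):
            cost = 0
            for d in desired:
                cover = next((a for a in kept if a >= d), None)
                if cover is None:
                    break  # this selection cannot serve width d
                cost += cover - d
            else:
                if best_cost is None or cost < best_cost:
                    best, best_cost = kept, cost
    return best
```

Repeating a width in `desired` weights it by how often it is requested, so request statistics could be fed in directly.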

Reply to "Meaningful statistics"
Richardguk (talkcontribs)

The RFC states the second part of the problem as: "the thumbnails are hard to cache properly since we have to cache a copy of each of the sizes". But there might be a better caching strategy than simply refusing to cache non-standard sizes.

Potential thumbnail strategies:

Serving
(A) serve arbitrary sizes as requested;
(B) serve only preferred sizes (rounding non-standard sizes according to some algorithm).
Storing
(1) store every requested thumbnail indefinitely;
(2) store only preferred-size thumbnails indefinitely.
Caching
(i) retain indefinitely (ad hoc deletion);
(ii) periodically delete the longest-unserved thumbnails (LRU), regenerating and recaching them if re-requested.

The status quo seems to be A.1.i. The proposal seems to be to switch to B.2.i or possibly A.2.i (I think Brion implies above that A.2 is "Tim's position").

But to prevent excessive thumbnail file storage, all that is needed is a better caching strategy. "Least Recently Used" seems like a sound but simple and non-disruptive basis for predicting which thumbnails will not be requested again.

So, instead of inconveniencing users by switching serving strategy or storing strategy, why not just switch caching strategy from (i) to (ii)?
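
Strategy (ii) can be illustrated with a small in-memory LRU sketch. It is a simplification: eviction happens on insert rather than periodically, and the `render` callable and `capacity` limit are hypothetical stand-ins for the real scaler and storage budget.

```python
from collections import OrderedDict

class ThumbCache:
    """Keep the most recently served thumbnails; regenerate evicted ones on demand."""

    def __init__(self, capacity, render):
        self.capacity = capacity  # maximum number of stored thumbnails
        self.render = render      # hypothetical callable: (name, width) -> data
        self._items = OrderedDict()

    def get(self, name, width):
        key = (name, width)
        if key in self._items:
            # Served again: mark as most recently used.
            self._items.move_to_end(key)
            return self._items[key]
        data = self.render(name, width)
        self._items[key] = data
        if len(self._items) > self.capacity:
            # Drop the thumbnail that has gone unserved the longest.
            self._items.popitem(last=False)
        return data
```

Arbitrary sizes are still served (A) and nothing special is stored for preferred sizes (1); only the retention policy changes.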

Hashar (talkcontribs)

Seems to be a good amendment to the "possible solution" part. Would you mind enhancing it with your above text? We still have to expose the exact problem though.

Reply to "Caching strategy"