Architecture meetings/RFC review 2014-04-16

21:00-22:00 UTC, April 16th, in #wikimedia-office.

Requests for Comment to review

  1. Requests for comment/Reducing image quality for mobile

Summary and logs

Meeting summary

  • LINK: https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-04-16 (sumanah, 21:03:18)
  • Today is probably going to be a short meeting - just 1 RfC on the agenda (sumanah, 21:03:27)
  • Reducing image quality for mobile (sumanah, 21:03:31)
    • LINK: https://www.mediawiki.org/wiki/Requests_for_comment/Reducing_image_quality_for_mobile (sumanah, 21:03:54)
    • I asked Yuri what he wanted: 1) an ok from ops to increase thumbnail storage by 2-3% and number of files by 15%, 2) from core/tim/etc to proceed with the proposed patch <yurik> assuming my proposed path is satisfactory to everyone's involved (sumanah, 21:04:15)
    • LINK: https://gerrit.wikimedia.org/r/#/c/119661/ Gerrit changeset, "Allow mobile to reduce image quality" (sumanah, 21:09:33)
    • comments were provided on the image quality gerrit patch (TimStarling, 21:42:33)
    • LINK: https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Reducing_image_quality_for_mobile#File_insertion_syntax on wikitext addition (sumanah, 21:42:56)
    • image scaler backend relatively uncontroversial -- HTML/URL manipulation to access that API is more complex (TimStarling, 21:43:27)
    • gwicke predictably favours Node.JS service (TimStarling, 21:44:42)
    • <yurik> ok, all settled, will implement the first step (core patch), and start implementing JS magic (sumanah, 21:48:52)
    • required modifications: use string instead of integer "qlow-100px-image.jpg", make it JPG only (no png) (yurik, 21:50:31)
    • Tim skeptical about client-side JS rewrite: potential for CPU usage, flicker, image load aborts, browser incompatibilities, etc. (TimStarling, 21:54:34)
  • Next week - Associated namespaces (sumanah, 21:57:01)
    • LINK: https://www.mediawiki.org/wiki/Requests_for_comment/Associated_namespaces Next week David Cuenca wants to find out whether there are any objections to the "Namespace registry and association handlers" that Mark proposed, discuss possible problems with his proposed approach, and see if there would be any hands available to work on it. He mentioned that "I hope this RFC moves forward because it affects important upcoming and already deployed projects (Commons migration, templates, Visual editor, WD, etc)." (sumanah, 21:57:07)
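
The "required modifications" item above (a quality class string instead of an integer, JPG only) can be illustrated with a minimal Python sketch. It is not the code from the Gerrit change: the function names, the QUALITY_CLASSES set, and the regular expression are assumptions made for illustration. Only the name shapes "qlow-100px-image.jpg" and "100px-image.jpg" come from the meeting.

```python
import re
from typing import Optional, Tuple

# Hypothetical set of quality classes; the meeting only mentions "qlow".
QUALITY_CLASSES = {"low"}

# Optional "q<class>-" prefix, then "<width>px-", then the original file name.
_THUMB_RE = re.compile(r"^(?:q(?P<quality>[a-z]+)-)?(?P<width>\d+)px-(?P<name>.+)$")


def thumb_name(filename: str, width: int, quality: Optional[str] = None) -> str:
    """Build a thumbnail name, optionally prefixed with a quality class."""
    if quality is None:
        # Normal-quality thumbnails keep the existing "100px-image.jpg" shape.
        return f"{width}px-{filename}"
    if quality not in QUALITY_CLASSES:
        raise ValueError(f"unknown quality class: {quality!r}")
    # Per the meeting, reduced quality applies to JPEG only (no PNG).
    if not filename.lower().endswith((".jpg", ".jpeg")):
        raise ValueError("reduced quality is only supported for JPEG files")
    return f"q{quality}-{width}px-{filename}"


def parse_thumb_name(thumb: str) -> Tuple[Optional[str], int, str]:
    """Split a thumbnail name into (quality class or None, width, file name)."""
    match = _THUMB_RE.match(thumb)
    if match is None:
        raise ValueError(f"not a thumbnail name: {thumb!r}")
    return match.group("quality"), int(match.group("width")), match.group("name")
```

Putting the quality class first, as bawolff suggested in the log, keeps the prefix easy to separate from the file name with a single anchored pattern.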


Full log

See in HTML or see below.

Meeting logs


21:02:02 <sumanah> #startmeeting RfC review: reducing image quality for mobile | Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE). https://meta.wikimedia.org/wiki/IRC_office_hours
21:02:02 <wm-labs-meetbot> Meeting started Wed Apr 16 21:02:02 2014 UTC and is due to finish in 60 minutes.  The chair is sumanah. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:02:02 <wm-labs-meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:02:02 <wm-labs-meetbot> The meeting name has been set to 'rfc_review__reducing_image_quality_for_mobile___channel_is_logged_and_publicly_posted__do_not_remove_this_note___https___meta_wikimedia_org_wiki_irc_office_hours'
21:02:21 * sumanah waits for Brion
21:02:38 <sumanah> #chair sumanah TimStarling
21:02:38 <wm-labs-meetbot> Current chairs: TimStarling sumanah
21:03:13 <sumanah> #chair sumanah TimStarling brion
21:03:13 <wm-labs-meetbot> Current chairs: TimStarling brion sumanah
21:03:15 * brion waves
21:03:18 <sumanah> #link https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-04-16
21:03:27 <sumanah> #info Today is probably going to be a short meeting - just 1 RfC on the agenda
21:03:31 <sumanah> #topic Reducing image quality for mobile
21:03:42 <TimStarling> the patch seems quite different to what yurik and I discussed at the architecture summit
21:03:50 <sumanah> ( but brion TimStarling - I may ask some follow-up questions at the end about a few other RfCs and pending things)
21:03:54 <sumanah> #link https://www.mediawiki.org/wiki/Requests_for_comment/Reducing_image_quality_for_mobile
21:04:15 <sumanah> #info I asked Yuri what he wanted: 1) an ok from ops to increase thumbnail storage by 2-3% and number of files by 15%, 2) from core/tim/etc to proceed with the proposed patch <yurik> assuming my proposed path is satisfactory to everyone's involved
21:04:19 <TimStarling> I thought that you should have only quality classes exposed, not expose an API allowing any integer percentage quality
21:04:59 <yurik> TimStarling, it would be fairly easy to change from a number to a string constant
21:05:11 <TimStarling> you suggest 30% but probably every mobile app will choose something different
21:05:12 <yurik> if this is a requirement of course
21:06:37 <yurik> TimStarling, this is similar to the problem we face with the thumbnail dimension  - every wiki varying images by a few pixels. I propose a somewhat different solution here - an extension that does filtering/rounding of these numbers during the rendering
21:07:04 <sumanah> thedj: dfoy_ - http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/20140416.txt for the logs up till now
21:07:21 <TimStarling> I don't see any filtering or rounding in the patch
21:07:41 <yurik> example: user requested 240x250 image - the ext would say 250x250 already exists, or it is a multiple of 50, hence render it as a link to 250x250, with width=240
21:08:00 * aude waves
21:08:11 <aude> yurik: is this something your extension would do? rather than core?
21:08:12 <sumanah> Hi :)
21:08:12 <yurik> separate patch - as an extension - to address all such rounding requirements for both image size & quality
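
[Editor's sketch] The rounding yurik describes above (render a 250px thumbnail, display it at the requested 240px) might look roughly like this in Python. The step of 50 comes from his example; the function names and the relative path layout are hypothetical and not part of any patch discussed here.

```python
def bucket_width(requested: int, step: int = 50) -> int:
    """Round a requested width up to the next multiple of `step`."""
    if requested <= 0 or step <= 0:
        raise ValueError("requested width and step must be positive")
    # Ceiling division without floats: -(-a // b) == ceil(a / b).
    return -(-requested // step) * step


def img_tag(filename: str, requested: int, step: int = 50) -> str:
    """Link to the bucketed thumbnail but keep the requested display width.

    The path layout is illustrative only, and no HTML escaping is done.
    """
    rendered = bucket_width(requested, step)
    src = f"{filename}/{rendered}px-{filename}"
    return f'<img src="{src}" width="{requested}">'
```

With these assumptions, a request for 240px reuses the 250px rendering, so nearby sizes share one stored thumbnail.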
21:08:16 <TimStarling> yeah, you can read my thoughts on that on the relevant RFC
21:08:47 <yurik> aude, not ours, a new extension whose job is only to "standardize" on thumbnail generation
21:08:54 <AaronSchulz> gah
21:08:59 <aude> but not core?
21:09:11 <sumanah> AaronSchulz: I presume you think that's the wrong approach :)
21:09:28 * brion just added comment on the patch agreeing with idea to use quality classes rather than exposing full integer range
21:09:33 <sumanah> #link https://gerrit.wikimedia.org/r/#/c/119661/ Gerrit changeset, "Allow mobile to reduce image quality"
21:09:34 <yurik> no, i think core should be more flexible - depending on the site
21:09:46 * aude prefers we allow any size, but not keep cached so long if it's not requested
21:09:58 <aude> if that's feasible
21:10:09 <TimStarling> me too
21:11:37 <thedj> can i ask what the primary purpose is ?
21:11:46 <thedj> reduce time to load ?
21:12:01 <sumanah> thedj: honest question: does the RfC address that? do you think the RfC should be clearer about the problem being solved?
21:12:03 <yurik> reducing quality? to lower bandwidth consumption
21:12:36 <thedj> yurik: so download time and download cost ?
21:13:30 <yurik> both
21:13:36 <thedj> Do we have some metrics/ideas to give us indications of how much benefit that would translate into ?
21:13:43 <yurik> especially when the bandwidth is donated
21:14:22 <yurik> thedj, 30-40%
21:14:55 <thedj> ah k. so it's to a large degree from the zero perspective that we want to do this.
21:15:01 <yurik> correct
21:15:36 <brion> i could see it being handy for hi-dpi devices as well, we could serve the double-size images with a medium quality setting to trade-off bandwidth and visual quality
21:15:38 <sumanah> BTW, for those who haven't looked, we now have a few more comments on the changeset https://gerrit.wikimedia.org/r/119661 in the last few minutes
21:15:50 <brion> but definitely the incentive is where we’re pushing donated bandwidth :)
21:15:55 <sumanah> (there's our brion always looking out for responsive design & gadget stuff :) )
21:16:19 <bawolff> My comment was just that it shouldn't touch the -quality setting on pngs, and a nitpick on the commit message
21:17:11 <gwicke> once we move to HTML storage, is the idea to implement this as a DOM post-processing step?
21:17:39 <yurik> TimStarling, brion, please take a look at the https://www.mediawiki.org/wiki/Requests_for_comment/Reducing_image_quality_for_mobile#Possible_approaches
21:18:15 <yurik> it discusses the 3 paths to do this, with 1 path doing everything internally without exposing it via URL
21:18:43 <brion> *nod* i was assuming the first pass implementation once the quality setting was available...
21:18:52 <TimStarling> probably option 2
21:18:54 <brion> … was to do it as a dom postprocess step in mf+zero
21:19:09 <TimStarling> that's not on the list
21:19:24 <yurik> that's #3 i think
21:19:26 <brion> agh, i confused that with the js one
21:19:59 <yurik> tim, you think it is better to let varnish do automagical image url rewrite?
21:20:19 * AaronSchulz prefers js if possible
21:20:31 <TimStarling> how would it work with JS?
21:20:38 <yurik> because we won't have as much info in varnish, plus we would have to put too much biz-logic in varnish (ops won't like it)
21:20:43 <gwicke> one issue I see with Varnish is transparent downstream caches
21:20:52 <yurik> yes, that too
21:20:55 <TimStarling> a DOM ready event?
21:20:55 <gwicke> the third option (JS) avoids that
21:21:03 <yurik> JS would rewrite the URL
21:21:10 <brion> hmm
21:21:24 <brion> my main concern with that is rewriting urls in JS without often loading the original url is tricky
21:21:35 <TimStarling> I am wondering what the CPU requirements of option 3 are
21:21:37 <AaronSchulz> gwicke: related to downstream caches is handling purges
21:21:51 <TimStarling> and whether there will be flicker, browser incompatibilities, etc.
21:21:58 <gwicke> AaronSchulz, *nod*
21:22:02 <AaronSchulz> I guess if it's the very frontend cache it's fine
21:22:09 <TimStarling> we can't really waste the CPU of phones the same way we can desktop browsers
21:22:17 <gwicke> we'd have to send s-maxage-0
21:22:19 <yurik> workflow:    zero ext changes src= to low quality,   JS changes it back to highres if device/network is good
21:22:21 <gwicke> =0
21:22:34 <brion> :\
21:22:45 <yurik> how expensive is a JS image tag search?
21:22:55 <gwicke> it's pretty cheap I believe
21:23:07 <brion> replacing them may be slow if it’s a big page with lots of images though
21:23:19 <gwicke> one querySelectorAll call
21:23:24 <brion> and you’ve got the issue of loading the original images and then the new ones....
21:23:25 <TimStarling> image loading will start as soon as the img tag is created, right?
21:23:27 <yurik> percentage wise i still think it won't be much
21:23:35 <AaronSchulz> TimStarling: I think so :/
21:23:45 <gwicke> yeah, I think that's the bigger issue
21:23:57 <gwicke> we have a similar issue with the thumb size pref
21:23:58 <yurik> that's the big question - can the low->high quality img tag replacement be done before browser starts loading them?
21:24:07 <TimStarling> what about what brion said, why is that not an option?
21:24:19 <TimStarling> <brion> … was to do it as a dom postprocess step in mf+zero
21:24:22 <gwicke> if we can find a way to suppress the original thumb load before resizing / quality downgrading, then that would be awesome
21:24:46 <yurik> TimStarling, we would have to do it anyway, but there will be users who would want high-end images
21:25:03 <brion> i think we’re trying to avoid having php-time cacheable differences on zero….. it’s all very scary
21:25:35 <brion> in general, trying to scale for estimated network bandwidth is just a tricky tricky business
21:26:45 <sumanah> tfinc: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/20140416.txt for chat so far
21:27:18 <yurik> there is another question - i am pretty sure there are many mobile users out there who don't have zero and who might want low bandwidth too
21:27:24 <TimStarling> what about having a separate new service to do DOM rewriting?
21:27:49 <yurik> so we really should have a mobile setting "auto/always high/always low"
21:28:01 <TimStarling> yurik: those users can put up with what we give them
21:28:05 <gwicke> TimStarling, that's doable for low volume
21:28:26 <gwicke> which zero is afaik
21:28:57 <gwicke> what are the peak request rates on zero in pages / s ?
21:29:00 <yurik> well, not those who are still on 2G, or who is paying high price for their internet.
21:29:24 <TimStarling> it's out of scope
21:29:31 <aude> yurik: then i'd want no images, if concerned about bandwidth (imho)
21:29:40 <aude> maybe my mobile browser allows that
21:29:45 <sumanah> Those who have questions for Max, he's here now
21:29:54 <TimStarling> the problem is complicated enough when it is just Zero
21:30:09 <dr0ptp4kt> fwiw, MobileFrontend already has an Images on/off toggle
21:30:13 <sumanah> (OK, maybe today's meeting WON'T be a short one after all.)
21:31:17 <TimStarling> dr0ptp4kt: does it work?
21:31:26 <TimStarling> or do the images start loading and then get aborted?
21:31:34 <dr0ptp4kt> TimStarling: it is completely rewritten html
21:31:40 <dr0ptp4kt> it works
21:31:40 <MaxSem> it works via DOM rewriting on PHP side
21:31:53 <gwicke> there are ways to parse html without loading images, using https://developer.mozilla.org/en-US/docs/Web/API/DOMParser for example
21:32:27 <gwicke> or XMLHttpRequest
21:32:27 <dr0ptp4kt> one caveat is supporting devices that don't support javascript, or rather "advanced javascript" as determined by rl
21:32:35 <TimStarling> gwicke: well, that's the kind of thing that I would expect to use a lot of client-side CPU
21:32:48 <gwicke> not really- it's using the normal html parser
21:32:59 <gwicke> it does rely on JS support though
21:33:05 <gwicke> and a non-sucky browser
21:33:13 <dr0ptp4kt> HA!
21:33:20 <gwicke> ;)
21:33:26 <MaxSem> I don't think that many devices we want to support will work well with this
21:33:53 <gwicke> are you using XMLHttpRequest currently?
21:34:06 <MaxSem> libxml2
21:34:14 <MaxSem> be its name forever cursed
21:34:22 <TimStarling> can someone give me a quick overview of how HTML delivery in MF works and what the plans for it are?
21:34:42 <dr0ptp4kt> we use xhr opportunistically. so it's usually to upgrade the experience, like avoid server roundtrips for newer phones
21:34:58 <dr0ptp4kt> er, bigger roundtrips
21:35:15 <MaxSem> шеэы ыешдд мукн кщгпр щт увпуы [typed with a Cyrillic keyboard layout; in Latin script: "it's still very rough on edges"]
21:35:18 <gwicke> I see, so you are hesitant to require it
21:35:25 <TimStarling> preferably in a latin script
21:35:28 <yurik> MaxSem, +2
21:35:34 <MaxSem> it's still quite buggy so is used only in alpha
21:35:54 <dr0ptp4kt> sumana, would you please wire up a translation bot now? :)
21:36:02 <MaxSem> plans are to  fix it
21:36:06 <MaxSem> ...eventually
21:36:11 <MaxSem> ...maybe
21:36:35 <gwicke> I don't see an issue with DOM post-processing on the server and storing that HTML back
21:36:36 <dr0ptp4kt> yeah, the xhr for w0 is more like getting runtime config to do things ahead of caches being purged (e.g., add zero-rated support for an additional language)
21:36:58 <sumanah> dr0ptp4kt: I think here it would just emit those cartoon profanity things, like $%#%@
21:37:17 <gwicke> as long as there are only a few variants and the transforms build on a known DOM spec that should work well
21:37:52 <yurik> gwicke, zero already does a DOM post-parse rewrite to replace all external URL links with special warning URLs
21:37:59 <MaxSem> I would reeeeeally love to avoid doing it in PHP again
21:38:21 <gwicke> it's fairly easy in JS
21:38:27 <gwicke> you can use jquery etc
21:38:37 <MaxSem> wouldn't be lethal for zero which already does HTML transformations, but  still sucks
21:38:50 <yurik> gwicke, assuming flip phone has it :(
21:38:59 <gwicke> yurik, I mean on the server
21:39:38 <yurik> do we have a framework for node.js extensions?
21:39:57 <gwicke> yurik, we have HTTP..
21:40:08 <gwicke> set up a service, make requests to it
21:40:24 <sumanah> So we're about 2/3 through the hour and I'm not sure what to #info :)
21:40:46 <yurik> gwicke, you mean PHP becomes a proxy to another service on internal network?
21:41:07 <yurik> in any case, this is an optimization for the future, outside of the scope imho
21:41:19 <TimStarling> sumanah: three of us wrote comments on the gerrit change
21:41:36 <gwicke> yurik, you can go through PHP if you want; depends on whether it adds info that would be hard to get otherwise
21:42:04 <AaronSchulz> do we actually need the wikitext syntax addition too?
21:42:14 <MaxSem> definitely not
21:42:16 * AaronSchulz leans toward not adding it
21:42:17 <brion> i think we don’t need the wikitext addition no
21:42:28 <brion> keep it opaque to that layer
21:42:32 <AaronSchulz> right
21:42:33 <TimStarling> #info comments were provided on the image quality gerrit patch
21:42:33 <brion> it’s a presentation-layer decision
21:42:34 <sumanah> !link https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Reducing_image_quality_for_mobile#File_insertion_syntax
21:42:50 <sumanah> er
21:42:53 <gwicke> -1 on the extra syntax
21:42:56 <sumanah> #link https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Reducing_image_quality_for_mobile#File_insertion_syntax on wikitext addition
21:43:03 <sumanah> can I say #agreed ? :)
21:43:07 <bawolff> +1 on not adding extra options to file syntax
21:43:27 <yurik> bawolff, how do you mean?
21:43:27 <TimStarling> #info image scaler backend relatively uncontroversial -- HTML/URL manipulation to access that API is more complex
21:43:38 <yurik> we need to distinguish low-quality URLs from the highs
21:43:41 <bawolff> yurik: I'm agreeing with everyone
21:44:03 <yurik> good position :)
21:44:05 <bawolff> yurik: as in not adding [[File:foo.jpg|quality=20]] ui
21:44:15 <yurik> gotcha
21:44:42 <TimStarling> #info gwicke predictably favours Node.JS service
21:44:48 <AaronSchulz> lol
21:44:53 <gwicke> hehe
21:45:06 <gwicke> that's in response to MaxSem's lament about libxml2 ;)
21:45:34 <MaxSem> having a service for that would be even more crufty
21:45:39 <yurik> ok, i will change the URL syntax to    image.jpg/100px-qlow-image.jpg   this way we can later change it to some other magic keywords
21:45:45 <sumanah> So it's sounding like people think this is a relatively uncontroversial idea overall and we're just talking about implementation, right?
21:46:08 * tfinc reads the backscroll
21:46:20 <yurik> any objections to that URL format?
21:46:32 <bawolff> yurik: Maybe re-order those parameters. Easier to regex out qlow-100px from the actual name of the file
21:46:37 <sumanah> "this" being the RfC as a whole
21:46:53 <MaxSem> +1
21:46:54 <bawolff> since we're going to be presumably keeping 100px-image.jpg for the normal quality image
21:47:15 <gwicke> are we sure that we need a different URL?
21:47:21 <MaxSem> yes
21:47:29 <MaxSem> varnish rewrites are evil
21:48:16 <gwicke> do we already have info about zero ip ranges in varnish?
21:48:27 <yurik> ok, all settled, will implement the first step (core patch), and start implementing JS magic
21:48:43 <MaxSem> gwicke, for all that is holy, don't
21:48:44 <yurik> gwicke, yes, varnish detects zero based on ip
21:48:52 <sumanah> #info <yurik> ok, all settled, will implement the first step (core patch), and start implementing JS magic
21:49:15 <MaxSem> especially since now only mobile varnishes know about zero
21:49:19 <gwicke> hmm, then it might not actually be that hard to use that for image request rewriting
21:50:31 <yurik> #info required modifications: use string instead of integer  "qlow-100px-image.jpg", make it JPG only (no png)
21:50:31 <gwicke> I'd be against adding that info if it wasn't there already; but since it's already there it seems that the extra complexity would be fairly limited
21:50:46 <TimStarling> varnish doesn't have a lot of string handling built in, but you can use inline C, I did it once...
21:51:17 <MaxSem> regexping it would actually be possible
21:51:34 <MaxSem> but still this would SUCK
21:52:00 <sumanah> I have a few min of "what's up next week + other RfC news you should be aware of" to say before the end of the hour.
21:52:04 <sumanah> Any closing statements?
21:52:39 <TimStarling> modules/varnish/templates/vcl/wikimedia.vcl.erb was my own little bit of varnish URL manipulation
21:53:30 <yurik> if we are done, would love to get +2 for https://gerrit.wikimedia.org/r/#/c/109853/
21:54:34 <TimStarling> #info Tim skeptical about client-side JS rewrite: potential for CPU usage, flicker, image load aborts, browser incompatibilities, etc.
21:55:21 <gwicke> avoiding a double-load is hard afaik
21:55:41 <TimStarling> which is an argument for doing it on the server side
21:55:49 <AaronSchulz> yeah it may not be possible to use JS
21:55:50 <gwicke> or in Varnish
21:56:04 <AaronSchulz> so it's 1-2
21:56:18 <TimStarling> we have so many powerful tools on the server side now, we shouldn't be so keen to offload processing
21:56:36 <sumanah> ok, I'm gonna wrap up with a couple other #topics
21:57:00 <gwicke> for normal desktop page views the thumb size pref is pretty much the only one that can't be easily handled in CSS
21:57:01 <sumanah> #topic Next week - Associated namespaces
21:57:07 <sumanah> #link https://www.mediawiki.org/wiki/Requests_for_comment/Associated_namespaces Next week David Cuenca wants to find out whether there are any objections to the "Namespace registry and association handlers" that Mark proposed, discuss possible problems with his proposed approach, and see if there would be any hands available to work on it. He mentioned that "I hope this RFC moves forward because it affects important upcoming and already depl
21:57:08 <sumanah> oyed projects (Commons migration, templates, Visual editor, WD, etc)."
21:57:15 <sumanah> er: "it affects important upcoming and already deployed projects (Commons migration, templates, Visual editor, WD, etc).""
21:57:45 <sumanah> #topic RfC news
21:57:53 <gwicke> so if we can find a way to do this in Varnish it might be possible to implement those prefs purely in CSS
21:58:03 <sumanah> #link https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Simplify_thumbnail_cache Mark Bergsma and Aaron Schulz just left some comments on the "Simplify thumbnail cache" RfC - if you're into that one, check them out
21:58:16 <sumanah> #info Pau Giner has updated his grid system RfC https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Grid_system with more detail, and has submitted a patchset to Gerrit https://gerrit.wikimedia.org/r/#/c/125387/ so that the discussion can get more specific. Also see the example implementation http://pauginer.github.io/agora-grid/
21:58:34 <sumanah> #info http://www.gossamer-threads.com/lists/wiki/wikitech/451921 "REST and SOA within MediaWiki - is my understanding right?" includes gwicke saying, "ideally the only code that directly talks to the database would live in a storage service, which exposes a REST API." which refers to https://www.mediawiki.org/wiki/Requests_for_comment/Storage_service in case you want to take a look at that
21:58:49 <sumanah> #info bd808 needs feedback on his structured logging patch - see http://lists.wikimedia.org/pipermail/wikitech-l/2014-April/075921.html
21:59:24 <sumanah> And as always I welcome your suggestions of what RfCs to talk about in these meetings next - and who specifically needs to be in those chats so we can sometimes change the timing
21:59:27 <sumanah> That's all from me.
21:59:57 <sumanah> yurik: did you get what you wanted (somewhat) today? :)
22:00:03 <sumanah> MaxSem: ^ (same question)
22:00:13 <sumanah> TrevorParscal: do you have an RfC that needs chatting about sometime soon?
22:00:17 <sumanah> (for instance)
22:00:32 <sumanah> #endmeeting