Scrum of scrums/2015-02-11

Facilitating: Grace Gellerman

Apps

  • We'll be meeting soon to go over the RESTBase / Node.js service content spike

Parsoid

  • We spent part of last week figuring out load issues on the Parsoid cluster
    • The deploy of https://gerrit.wikimedia.org/r/#/c/173834/ on Jan 28th exposed a few latent bugs in Parsoid on a tiny subset of pages. In particular, https://phabricator.wikimedia.org/T88864 sent Parsoid into an infinite loop on those pages, and our timeout handling did not kill the stuck processes in all cases. Repeated retries on 2 enwiki and 3 plwiki pages then caused load spikes on Thursday and Friday. The repeated retries from the job queue are tracked in https://phabricator.wikimedia.org/T85939. The infinite loop bug has since been fixed and the fix deployed; load has been back to normal since Saturday.
  • Continued focus on VE goals
    • Will deploy code today to reduce size of parsed HTML (by stripping some private attributes that are no longer necessary).
    • Among others, https://phabricator.wikimedia.org/T88495 is close to being resolved.
    • Marc has started work on dependent tasks that are required to reduce the size of references HTML (https://phabricator.wikimedia.org/T88290). Heads-up to the Content Translation team about upcoming changes to <ref>.data-mw.
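
For readers unfamiliar with the failure mode in the load-issue item above, here is a minimal Python sketch of the two safeguards involved: a hard timeout that kills a stuck worker, and a cap on retries so a bad page cannot be retried forever. This is not Parsoid's code (Parsoid is a Node.js service). The names parse_with_timeout, run_job, and MAX_ATTEMPTS are hypothetical, and a child Python process stands in for a page parse.

```python
import subprocess
import sys

MAX_ATTEMPTS = 2  # hypothetical retry cap, not a real Parsoid or job queue setting


def parse_with_timeout(code, timeout_s):
    """Run `code` in a child Python process and report whether it succeeded.

    subprocess.run kills the child and waits for it if the timeout expires,
    so a stuck (infinitely looping) child cannot linger and keep using CPU.
    """
    try:
        result = subprocess.run([sys.executable, "-c", code], timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def run_job(code, timeout_s=5.0, max_attempts=MAX_ATTEMPTS):
    """Attempt a job at most `max_attempts` times, then abandon it."""
    for attempt in range(1, max_attempts + 1):
        if parse_with_timeout(code, timeout_s):
            return ("ok", attempt)
    return ("abandoned", max_attempts)
```

Calling run_job("while True: pass", timeout_s=0.5) simulates a page that triggers an infinite loop: each attempt is killed at the timeout, and the job is abandoned after the capped number of attempts instead of being retried indefinitely.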

RelEng/QA

  • Browser tests found a lot of bugs in several repos this week. Nothing blocking that we know of.
  • Search in beta labs has been broken since Monday, sorry
  • We plan this week to remove browser test builds that target test2wiki. Please let us know if you object.
  • Upcoming: errors and fatals in production logs need more attention, and we'll be working on getting them the attention they deserve: https://phabricator.wikimedia.org/T89049

Security

  • Security release soonish
  • Gerrit 187728 (mobile) and T78730 (ops) - in progress
  • Review for SMTP errors (ops), Capiunto (wikidata) starting soon

MW core

  • Bryan working on T88732 (Decouple logging infrastructure failures from MediaWiki logging)
  • Ori started discussion about paying attention to logs and fixing problems
  • Wikidata Query Service continuing to review alternatives to Titan
  • Documenting authorization use-cases/user stories to flesh out AuthStack RfC
  • Draft RfC for Multi-Datacenter concerns <https://www.mediawiki.org/wiki/Requests_for_comment/Master_%26_slave_datacenter_strategy_for_MediaWiki>
  • Preparing for security release with fixes for issues found by iSec security review

Fundraising Tech

  • Working on language variant support for upcoming China campaigns
  • More DonationInterface cleanup
  • More internal dash customization & a/b testing widget planning
  • Deploying some CentralNotice performance improvements

Services

  • RESTBase deployment - early next week
    • HW config still pending (should be completed early next week) - T76986
    • public API endpoint - T78194

Analytics

Mobile Web

  • Fixed CentralAuth bug by putting all the 1x1s in a div
  • Auth sharing on wikimedia.org mobile domains still broken (https://phabricator.wikimedia.org/T88860)
  • Still working on server-side HTML templating in core

Language

  • Cleaning up bugs
  • Expanding language set
  • Working on Yandex integration
  • Statistics page
  • Working on API versioning
  • Extension registration for CX

Otto has been out for a week, so we don't know of any ops updates.

  • T76986 RESTBase production hardware - in progress. Should be able to rack them next week.

Editing

  • Regression with auto-height TextInputWidgets in OOjs UI 0.6.6, fixed in master, backport for production coming
  • Ops are now responding on https://phabricator.wikimedia.org/T76308, which seems to be moving along