Wikimedia Engineering/Report/2014/June

Note: We're also providing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.

Engineering metrics in June:

  • 151 unique committers contributed patchsets of code to MediaWiki.
  • The total number of unresolved commits went from around 1440 to about 1575.
  • About 14 shell requests were processed.

Upcoming events

There are many opportunities for you to get involved and contribute to MediaWiki and technical activities to improve Wikimedia sites, both for coders and contributors with other talents.

For a more complete and up-to-date list, check out the Project:Calendar.

Date | Event | Contact
8 July 2014 | ECT: Office Hour, 16:00 UTC, #wikimedia-office | qgil
10 July 2014 | IRC discussion of frontend and UX standardization, 22:30-23:30 UTC | Trevor Parscal
15 July 2014 | Tech Talk: Hadoop and Beyond. An overview of Analytics infrastructure, 17:00 UTC, IRC: #wikimedia-dev | NRuiz (WMF)
16 July 2014 | IRC discussion of the Vertical writing support proposal, 18:00-19:00 UTC | Yair rand
23 July 2014 | IRC discussion of the proposal that we support Composer-managed libraries on the WMF cluster, 21:00-22:00 UTC | Bryan Davis
29 July 2014 | Tech Talk: HHVM in production: what that means for Wikimedia developers, 19:00 UTC, #wikimedia-dev | Ori.livneh
30 July 2014 | IRC discussion of Requests for comment/CentralNotice Caching Overhaul - Frontend Proxy, 22:00-23:00 UTC, #wikimedia-office | Matt Walker

Personnel

Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.

Announcements

  • Elliot Eggleston joined the Wikimedia Foundation as a Features Engineer in the Fundraising-Tech team (announcement).

Technical Operations

New Dallas data center

On-site work has started in our new Dallas (Carrollton) data center (codfw). Racks have been installed, the equipment we moved from Tampa has been racked, and cabling work was mostly completed over the course of the month. We are now awaiting the installation of connectivity to the rest of our network, as well as the arrival of the first newly ordered server equipment, so that server and network configuration can commence.

Puppet 3 migration

In June we migrated from Puppet 2 to Puppet 3 on all production servers. Thanks to the hard work of both volunteers and Operations staff on our Puppet repository in the months leading up to this, the migration went very smoothly.

Labs metrics in June:

  • Number of projects: 173
  • Number of instances: 424
  • Amount of RAM in use (in MBs): 1,741,312
  • Amount of allocated storage (in GBs): 19,045
  • Number of virtual CPUs in use: 855
  • Number of users: 3,356

Wikimedia Labs

Last month we switched the Labs puppetmaster to Puppet 3; this month all instances switched over as well. Some cleanup work was needed in our puppet manifests to handle Trusty and Puppet 3 properly; everything is fairly stable now but a bit of mopping up remains.

Editor retention: Editing tools

VisualEditor

In June, the VisualEditor team fixed 94 bugs and tickets. The team provided a new way to see the context of links and other items while editing, worked on the performance and stability of the editor so that users could make changes to articles more swiftly and reliably, and made some features simpler and easier to understand. The editor now highlights where dragged-and-dropped content will land, and drag-and-drop works for any content, not just for images. The citation and reference tools received minor adjustments, based on feedback and user testing, to guide users on how they operate. Many issues with windows opening and closing were fixed, especially in the link editing tool, alongside fixes to the save dialog, categories, the language editing tool, table styling, template display and highlights on selected items. The mobile version of VisualEditor, currently available for alpha testers, moved towards release, with a number of bug fixes and performance improvements. Work to support languages made some significant gains, and work to support Internet Explorer continued. The new visual interface for writing TemplateData was enabled on the Catalan and Hebrew Wikipedias. The deployed version of the code was updated four times in the regular release cycle (1.24-wmf8, 1.24-wmf9, 1.24-wmf10 and 1.24-wmf11).

Parsoid

In June, the Parsoid team continued with ongoing bug fixes and bi-weekly deployments. Areas that saw ongoing work include the selective serializer, parsing support for some table-handling edge cases, nowiki handling, and parsing performance. We also began work on supporting language converter markup.

We added CSS styling to the HTML to ensure that Parsoid HTML renders like PHP parser output, and we continued to tweak the CSS based on the rendering differences we found. We also started work on computing visual diffs by taking screenshots of the rendered output of Parsoid HTML and PHP parser HTML. This initial proof of concept will serve as the basis for larger-scale automated testing and identification of rendering diffs.
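To illustrate the general idea of a screenshot-based visual diff, here is a minimal sketch that compares two renderings pixel by pixel and reports the fraction of pixels that differ. It is not the Parsoid team's tooling: the function name `diff_ratio`, the in-memory pixel-grid representation and the `tolerance` parameter are all assumptions made for this example, and a real pipeline would load actual screenshots with an image library.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Image = List[List[Pixel]]      # rows of pixels; hypothetical in-memory screenshot


def diff_ratio(a: Image, b: Image, tolerance: int = 0) -> float:
    """Return the fraction of pixels that differ between two same-sized images.

    A pixel counts as different when any colour channel differs by more
    than `tolerance` (an assumed knob for ignoring anti-aliasing noise).
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have identical dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    differing = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            if any(abs(ca - cb) > tolerance for ca, cb in zip(pixel_a, pixel_b)):
                differing += 1
    return differing / total


# Toy 2x2 "screenshots": one pixel out of four differs.
WHITE: Pixel = (255, 255, 255)
BLACK: Pixel = (0, 0, 0)
php_render: Image = [[WHITE, WHITE], [WHITE, WHITE]]
parsoid_render: Image = [[WHITE, BLACK], [WHITE, WHITE]]

print(diff_ratio(php_render, parsoid_render))  # 0.25
```

A ratio like this could be tracked per page to flag the pages whose two renderings diverge most, which is one plausible way to prioritise rendering differences at scale.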

The GSoC 2014 LintTrap project saw good progress and a demo LintBridge application was made available on wmflabs with the wikitext issues detected by LintTrap.

We also had our quarterly review this month and contributed to the annual engineering planning process.

Core Features

Flow/Project information

 
Presentation slides on Flow from the metrics meeting for June

In June, the Flow team finished an architectural rewrite of the front end, so Flow will be easier to keep updating in the future. This will be released to MediaWiki.org in the first week of July, and to Wikipedia the following week.

The new feature in this release is the ability to sort topics on a Flow board. There are now two options for the order that topics appear on the board: you can see the most recently created threads at the top (the default), or the most recently updated threads. This new sorting option makes it easier to find the active conversations on the board.
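The two orderings described above can be sketched as a sort on different timestamps. This is only an illustration, not Flow's actual code: the `Topic` class, its field names and the `"newest"` / `"updated"` order labels are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from operator import attrgetter
from typing import List


@dataclass
class Topic:
    """Hypothetical stand-in for a topic on a discussion board."""
    title: str
    created: datetime
    updated: datetime


# Map each (assumed) sort option to the timestamp it orders by.
SORT_KEYS = {
    "newest": attrgetter("created"),   # most recently created first (default)
    "updated": attrgetter("updated"),  # most recently updated first
}


def sort_topics(topics: List[Topic], order: str = "newest") -> List[Topic]:
    """Return a new list of topics, latest first, for the given order."""
    if order not in SORT_KEYS:
        raise ValueError(f"unknown sort order: {order!r}")
    return sorted(topics, key=SORT_KEYS[order], reverse=True)


topics = [
    Topic("Welcome", datetime(2014, 6, 1), datetime(2014, 6, 20)),
    Topic("Sorting feedback", datetime(2014, 6, 10), datetime(2014, 6, 12)),
    Topic("Font size", datetime(2014, 6, 15), datetime(2014, 6, 16)),
]

print([t.title for t in sort_topics(topics)])
# ['Font size', 'Sorting feedback', 'Welcome']
print([t.title for t in sort_topics(topics, "updated")])
# ['Welcome', 'Font size', 'Sorting feedback']
```

In this toy data the oldest topic, "Welcome", jumps to the top under the "updated" order because it received the latest reply, which is the behaviour that helps surface active conversations.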

We've also made a few changes to make Flow discussions easier to read, including: a font size now consistent with other pages; dropdown menus now easier to read; the use of the new button style, and the WikiGlyphs webfont.

Growth

Growth

In June, the Growth team completed analysis of its first round of A/B testing of signup invitations for anonymous editors on English, French, German, and Italian Wikipedias. Based on these results, the team prepared a second version to be A/B tested.

Additionally, the team released a major refactor of the GuidedTour extension's API, as well as design enhancements like animations, a new CSS-based way of drawing guider elements, updated button styles, and more. The team also launched GuidedTours on three new Wikipedias: Arabic, Norwegian, and Bengali.

Support

Wikipedia Education Program

This month, the Education Program extension again received incremental improvements and bugfixes. Sage Ross of the Wiki Education Foundation submitted two patches: one that adds information to the API for listing students, and another that lets anonymous users compare course versions. Also, a student from Facebook Open Academy fixed a usability issue in the article assignment feature.

Wikimedia Apps

The Mobile Apps team released the new Android Wikipedia app and it is now available to be downloaded through the Google Play store on Android devices.

Core features of the app include the ability to save pages for offline reading, a record of your browsing history, and the ability to edit either as a logged-in user or anonymously. This makes the app the first mobile platform that allows anonymous editing! The app also supports Wikipedia Zero for participating mobile carriers.

Additional work done this month includes the start of implementing night mode for the Android app (by popular demand), creating an onboarding experience which is to be refined and deployed in July, and numerous improvements to the edit workflow.

Mobile web projects

This month, the mobile web team finished work on styling the mobile site to provide a better experience for tablet users. We began redirecting users on tablets, who had previously been sent to the desktop version of all Wikimedia projects, to the new tablet-optimized mobile site on June 17. Our early data suggests that this change had a positive impact on new user signup and new editor activation numbers. We also continued work on VisualEditor features (the linking and citation dialogs) in preparation for releasing the option to edit via VisualEditor to tablet users in the next three months.

Wikipedia Zero

During the last month, the team deployed the refactored Wikipedia Zero codebase that replaces one monolithic extension with multiple extensions. The JsonConfig extension, which allows a wiki-driven JSON configuration system with data validation and a tiered configuration management architecture, had significant enhancements to make it more general for other use cases.

Additionally, the team enabled downsampled thumbnails for a live in-house Wikipedia Zero operator configuration, and finished Wikipedia Zero minimum viable product design and logging polish for the Android and iOS Wikipedia apps. The team also supported Wikipedia apps development with network connection management enhancements in Android and iOS, with Find in page functionality for Android, and with responses to Google Play reviews of Wikipedia for Android.

The team facilitated discussions on proxy and small screen device optimization, and examined the HTML5 app landscape for the upcoming fiscal year's development roadmap. The team also created documentation for operators for enabling zero-rating with different connection scenarios. Bugfixes were issued for the mobile web Wikipedia Zero and the Wikipedia for Firefox OS app user experience.

Routine pre- and post-launch configuration changes were made to support operator zero-rating, with routine technical assistance provided to operators and the partner management team to help add zero-rating and address anomalies. Finally, the team participated in recruitment for a third Partners engineering teammate.

Wikipedia Zero (partnerships)

We launched Wikipedia Zero with Airtel in Bangladesh, our third partner in Bangladesh, and our 34th launched partner overall. We participated in the Wiki Indaba conference, the first event of its kind to be held in Africa. The event, organized by Wikimedia South Africa, brought together community members from Tunisia, Egypt, Ghana, Kenya, Namibia, Nigeria, Ethiopia, Malawi and South Africa. The attendees shared experiences and challenges of working in the region and formulated strategies to support and strengthen the movement's efforts across the continent. While in South Africa, Adele Vrana also met with local operators. Meanwhile, Carolynne Schloeder met with numerous operators and handset manufacturers in India. Carolynne joined Wikimedian RadhaKrishna Arvapally for a presentation at C-DOT, and both participated in a blogger event hosted by our partner Aircel, along with other members of Wikimedia India in Bangalore. Smriti Gupta joined the group as Mobile Partnerships Manager, Asia.

Language tools

Content translation

The team added support for link adaptation, worked on the infrastructure for machine translation support using Apertium and on hiding templates, images and references that cannot be easily translated. They also prepared for deployment on beta wikis and made multiple bug fixes and design tweaks.

MediaWiki Core

HHVM

The team has been running HHVM on a single test machine ("osmium") for the purpose of testing the job queue in production. The machine is only put into production on a very limited basis, until enough bugs are found to keep the team busy for a while, and then it is disabled again while the team fixes those bugs. We're planning on having HHVM running continually on a few job runner machines in July, then turning our focus toward running HHVM on the main application servers, taking a similar strategy.

Wikimedia Release and QA Team

The Release and QA Team had their mid-quarter check-in on June 27. Phabricator work is progressing nicely. The latest MediaWiki tarball release (1.23) was made and the second RFP started and is close to completion. We are moving to only WMF-hosted Jenkins for all jobs, and we are working with the MediaWiki Core and the Operations teams on HHVM-related integration (both for deployment and for the Beta Cluster).

Admin tools development

Work on this project is currently being completed along with the SUL finalisation project, including the global rename tool (bug 14862) and cleaning up the CentralAuth database (bug 66535).

Search

CirrusSearch is running as the default search engine on all but the highest-traffic wikis at this point. Nik Everett and Chad Horohoe plan to migrate most of the remaining wikis in July, leaving only the German and English Wikipedias to migrate in August.

Auth systems

We continued work on the SOA Authentication RFC and Phabricator OAuth integration. We made OAuth compatible with HHVM and made other minor bug fixes.

SUL finalisation

The MediaWiki Core team has committed to having the following work completed by the end of September 2014:
  • Completing the necessary engineering work to carry out the finalisation.
  • Setting a date on which the finalisation will occur (Note: this date may be after September).
  • Having a communications strategy in place, and community liaisons to carry that out, for the time period between the announcement of the date of the finalisation and the finalisation proper.

Security auditing and response

We released MediaWiki 1.23.1 to prevent multiple issues caused by loading external SVG resources. We also performed security reviews of the Wikidata property suggester, Extension:Mantle for mobile/Flow, and Flow's templating rewrite.

Quality assurance

Quality Assurance

This month saw significant improvements to the MediaWiki-Vagrant development environments from new WMF staff member Dan Duvall. We have completed support for running the full suite of browser tests on a Vagrant instance under the VisualEditor role. In the near future, we will extend that support to the MobileFrontend and Flow Vagrant roles, as well as making general improvements to Vagrant overall. Another great QA project is from Google Summer of Code intern Vikas Yaligar, who is using the browser test framework to automate taking screen captures of aspects of VisualEditor (or any other feature) in many different languages, for the purpose of documentation and translation.

Quality Assurance/Browser testing

After two years of using a third-party host to run browser test builds in Jenkins, this month we completed the migration of those builds to Jenkins hosted by the Wikimedia Foundation. Hosting our browser test builds ourselves gives us more control over every aspect of running the browser tests, as well as the potential to run them faster than previously possible. Particular thanks to Antoine Musso, whose work made it possible. Simultaneously, we have also ported all of the remaining tests from the /qa/browsertests repository either to /mediawiki/core or to their relevant extension. This gives us the ability to package browser-based acceptance tests with the release of MediaWiki itself. After more than two years evolving the browser testing framework across WMF, the /qa/browsertests repository is retired, and all of its functions now reside in the repositories of the features being tested.

Multimedia

Multimedia

In June, the multimedia team released Media Viewer v0.2 on all Wikimedia wikis, with over 20 million image views per day on sites we track. Global feedback was generally positive and helped surface a range of issues, many of which were addressed quickly. Based on this feedback, Gilles Dubuc, Mark Holmquist, and Gergő Tisza developed a number of new features, with designs by Pau Giner: view images in full resolution, view images in different sizes, show more image information, edit image file pages, as well as easy disable tools for anonymous users and editors.

This month, we started working on the Structured Data project with the Wikidata team, to implement machine-readable data on Wikimedia Commons. We are now in a planning phase and aim to start development in Fall. We ramped up our work on UploadWizard, reviewed user feedback, collected metrics, fixed bugs and started code refactoring, with the help of contract engineer Neil Kandalgaonkar. We also kept working on technical debt and bug fixes for other multimedia tools, such as image scalers, GWToolset and TimedMediaHandler, with the help of Summer contractor Brian Wolff.

As product manager, Fabrice Florin helped plan our next steps, hosting a planning meeting and other discussions of our development goals, and led an extensive review of user feedback for Media Viewer and UploadWizard with new researcher Abbey Ripstra. Community liaison Keegan Peterzell introduced Media Viewer and responded to user comments throughout the product's worldwide release. To learn more about our work, we invite you to join our discussions on the multimedia mailing list.

Bug management

Apart from gruntwork (handling new tickets; prioritizing tickets; pinging on older tickets) and Andre's main focus on Phabricator, Parent5446, Krinkle and Andre created several requested Bugzilla components, plus moved 'MediaWiki skins' to a Bugzilla product of their own. In Bugzilla's codebase, Tony and TTO styled Bugzilla's Alias field differently, Tony removed the padlock icons for https links in Bugzilla and cleaned up the codebase, and Odder fixed a small glitch in Bugzilla's Weekly Summary and rendering of custom queries on the Bugzilla frontpage. Numerous older tickets with high priority were triaged on a bugday.

Phabricator/Migration

Apart from discussions on how to implement certain functionality and settings in Phabricator among team members and stakeholders, Mukunda implemented a MediaWiki OAuth provider in Phabricator (Gerrit changes: 1, 2; related ticket) and Chase created a Puppet module for Phabricator.

Mentorship programs

Google Summer of Code and FOSS Outreach Program for Women interns and mentors evaluated each other as part of the mid-term evaluations. Reports are available for all projects:

Technical communications

In addition to ongoing communications support for the engineering staff, Guillaume Paumier focused on information architecture of Wikimedia engineering activities. This notably involved reorganizing the Wikimedia Engineering portal (now linked from MediaWiki.org's sidebar) and creating a status dashboard that lists the status of all current activities hosted on MediaWiki.org. The portal is now also cross-linked with the other main tech spaces (like Tech and Tech News) and team hubs.

Volunteer coordination and outreach

Volunteers and staff are beginning to add or express interest in topics for the 2014 Wikimania Hackathon in London. The WMUK team is working hard to finalize venue logistics so that we can schedule talks and sessions in specific rooms. Everything is on track for a successful (and very large!) Hackathon. Tech Talks held in June: How, What, Why of WikiFont on June 12 and A Few Python Tips on June 19. A new process has been set up for volunteers needing to sign an NDA in order to be granted special permissions in Wikimedia servers. On a similar note, we have started a project to implement a Trusted User Tool in Phabricator, in order to register editors of Wikimedia projects that have been granted special permissions after signing a community agreement.

Architecture process

Developers had several meetings on IRC about architectural issues or Requests for comment:

Analytics/Wikimetrics

To support Editor Engagement Vital Signs, the team has implemented a new metric: Newly Registered User. There is also a new backup system to preserve users' reports on cohorts, as well as the ability to tag cohorts. A number of bugs have been fixed, including a bug affecting the first run of a recurrent report, and reports can no longer be created with invalid cohorts.

Analytics/Data Processing

The team has now integrated Data Processing as part of its Development Process. New Stories/Features have been identified and tasked. Also, experimentation with Cloudera Hadoop 5 is complete and we are ready to upgrade the cluster in July.

Analytics/Editor Engagement Vital Signs

The ability to run a metric over an entire project (wiki) in Wikimetrics drives us closer to producing data daily for our first Vital Sign. The team has also iterated on the design of the dashboard and navigation. We added a requirement from executives to have a default view when EEVS is loaded. This view would display metrics for the 7 largest Wikipedias.

Analytics/EventLogging

We fixed a serious bug where cookie data was getting captured in the country column. Saved data was scrubbed of the unwanted information and some old and unused tables were dropped. The team also implemented Throughput Monitoring to help catch potential issues in EventLogging.

Analytics/Research and Data

This month we refined the Editor Model – a proposal to model the main drivers of monthly active editors – and expanded the documentation of the corresponding metric definitions. We applied this model to teams designing editor engagement features (Growth, Mobile) and supported them in setting targets for the next fiscal year.

We analyzed the early impact of the tablet desktop-to-mobile switchover on traffic, edit volume, unique editors, and new editor activation.

We hosted the June 2014 edition of the research showcase with two presentations on the effect of early socialization strategies and on predictive modeling of editor retention.

We released wikiclass, a library for performing automated quality assessment of Wikipedia articles.

We released longitudinal data on the daily edit volume for all wikis with VisualEditor enabled, since the original rollout.

We continued work on an updated definition for PageViews.

Finally, we held our quarterly review (Q4-2014) and presented our goals for the next quarter (Q1-2015).

The Wikidata project is funded and executed by Wikimedia Deutschland.

The team worked on fixing bugs as well as a number of features. These include data access for Wikiquote, support for redirects, the monolingual text datatype, as well as further work on queries. Interface messages were reworked to make them easier to understand. First mockups of the new interface design have been published for comments. The entity suggester that a team of students worked on over the past months has been deployed. It makes it easier to add new statements by suggesting what kind of statements are missing on an item. Magnus extended Wikidata the Game with two games: one to add date of birth and date of death to people, and one to add missing images.

Future

The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.