Wikimedia Engineering/Report/2014/July
Note: We're also providing a shorter and translatable version of this report.
Engineering metrics in July:
- 164 unique committers contributed patchsets of code to MediaWiki.
- The total number of unresolved commits went from around 1575 to about 1642.
- About 31 shell requests were processed.
Upcoming events
There are many opportunities for you to get involved and contribute to MediaWiki and technical activities to improve Wikimedia sites, both for coders and contributors with other talents.
For a more complete and up-to-date list, check out the Project:Calendar.
Date | Type | Event | Contact |
---|---|---|---|
6 August 2014–7 August 2014 | Wikimania 2014 Hackathon (London, England) | Wikimania 2014 Hackathon | |
13 August 2014 | | IRC discussion of several RfCs for next actions, 2100-2200 UTC in #wikimedia-office | Sumana Harihareswara |
Personnel
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.
- VP of Engineering
- Software Engineer - Front-end (VisualEditor)
- Software Engineer - Services
- Software Engineer - Front-end
- Software Engineer - Maps & Geo
- Software Engineer - Mobile - iOS
- QA Tester
- Software Engineer - Full Stack
- Lean/Agile Coach
- Product Manager
- Product Manager - Language Engineering
- Operations Security Engineer
- UX Senior Designer
- UX Senior Design Researcher
- UX User Research Recruiter
- Project Coordinator - Engineering
- Mobile Partnerships Regional Manager
- Program Evaluation Internship
Announcements
- Arthur Richards is now Team Practices Manager (announcement).
- Kristen Lans joined the Team Practices Group as Scrum Master (announcement).
- Joel Sahleen joined the Language Engineering team as Software Engineer (announcement).
Technical Operations
Dallas data center
- Throughout July, the cabling work of all racked servers and other equipment was nearly completed. We're still awaiting the installation of the first connectivity to the rest of our US network in early August before we can begin installation of servers and services.
San Francisco data center
- Due to a necessary upgrade to the power and cooling infrastructure in our San Francisco data center (which we call ulsfo), our racks were migrated to a new floor within the same building on July 9. The move went smoothly and without user impact, and the site was back online serving all user traffic in less than 24 hours.
PFS enabled
- With the help of volunteer work and research, our staff enabled Perfect Forward Secrecy on our SSL infrastructure, significantly increasing the security of encrypted user traffic.
Labs metrics in July:
- Number of projects: 173
- Number of instances: 464
- Amount of RAM in use (in MBs): 1,933,824
- Amount of allocated storage (in GBs): 20,925
- Number of virtual CPUs in use: 949
- Number of users: 3,500
Wikimedia Labs
- We've made several minor updates to Wikitech: we added OAuth support, fixed a few user interface issues, and purged the obsolete 'local-*' terminology for service groups.
- OPW intern Dinu Sandaru has set up forms for structured project documentation. This should help match new volunteers with existing projects, and will make communication with project administrators more straightforward.
- Sean Pringle is in the process of updating the Tool Labs replica databases to MariaDB version 10.0. This may reduce replag, and should improve performance and reliability.
- We're setting up new storage hardware for the project dumps. This will resolve our ongoing problems with full drives and out-of-date dumps.
Editor retention: Editing tools
The new design, with controls focussed at the top of each window in consistent positions, was made possible due to the significant progress made in cross-platform support in the UI library, which now provides responsively-sized windows that can work on desktop, tablet and phone with the same code. HTML comments are occasionally used on a few articles to alert editors to contentious or problematic issues without disrupting articles as they are read, so making them prominently visible avoids editors accidentally stepping over expected limits. Re-using citations is now provided with its simple dialog available in the toolbar so that it is easier for users to find.
Other improvements include an array of performance fixes aimed especially at mobile users, fixes for a number of minor cases where VisualEditor would corrupt the page, better monitoring of corruptions if they occur, and better support for right-to-left languages, with icons displayed in the right orientation based on context.
The mobile version of VisualEditor, currently available for beta testers, moved towards stable release, fixing a number of bugs and editing issues and improving loading performance. Our work to support languages made some significant gains, nearing the completion of a major task to support IME users, and the work to support Internet Explorer uncovered some more issues as well as fixes. The deployed version of the code was updated five times in the regular release cycle (1.24-wmf12, 1.24-wmf13, 1.24-wmf14, 1.24-wmf15 and 1.24-wmf16).
In wider news, the team expanded its scope to cover all MediaWiki editing tools as well, as the new Editing Team (covered below).
The biggest Editing change this month was in the Cite extension (for footnotes) – this now automatically shows a references list at the end of the page if you forget to put in a <references /> tag, instead of displaying an ugly error message. The Math extension (for formulæ) was improved with more rigorous error handling and LaTeX formula checking, as part of the long-term volunteer-led work to introduce MathML-based display and editing. The TemplateData GUI editor was deployed to a further six wikis – the English, French, Italian, Russian, Finnish and Dutch Wikipedias.
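The Cite fallback described above can be illustrated with a small sketch. It is not the extension's actual code (Cite is written in PHP and works at parse time); the function name and the simplified regular expressions are assumptions made purely for illustration.

```python
import re

# Simplified, hypothetical patterns; real wikitext parsing is far more involved.
REF_OPEN = re.compile(r"<ref[\s>]", re.IGNORECASE)
REFERENCES_TAG = re.compile(r"<references\b[^>]*>", re.IGNORECASE)


def ensure_references_list(wikitext: str) -> str:
    """Append a references list when footnotes exist but no list was placed."""
    has_footnotes = REF_OPEN.search(wikitext) is not None
    has_list = REFERENCES_TAG.search(wikitext) is not None
    if has_footnotes and not has_list:
        # Fall back to a list at the end of the page instead of an error.
        return wikitext.rstrip("\n") + "\n<references />\n"
    return wikitext
```

Pages that already contain a `<references />` tag, or that have no footnotes at all, pass through unchanged.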
With an eye towards supporting Parsoid-driven page views, the Parsoid team strategized on addressing Cite extension rendering differences that arise from site-message-based customizations, and is considering a pure CSS-based solution for the common use cases. We also finished developing the test setup for mass visual diff tests between PHP parser rendering and Parsoid rendering. It was tested locally and we started preparations for deploying it on our test servers. This will go live at the end of July or in early August.
The GSoC 2014 LintTrap project continued to make good progress. We had productive conversations with Project WikiCheck about integrating LintTrap with WikiCheck in a couple different ways. We hope to develop this further over the coming months.
Overall, this was also a month of reduced activity, with Gabriel now officially full time in the Services team and Scott focused on the PDF service deployment that went live a couple of days ago. The full team is also spending a week at an off-site meeting, working and spending time together in person prior to Wikimania in London.
Services
The brand new Services group (currently Matt Walker and Gabriel Wicke) started July with two main projects:
- PDF render service deployment
- Design and prototyping work on the storage service and REST API
The PDF render service is now deployed in production, and can be selected as a render backend in Special:Book. The renderer does not work perfectly on all pages yet, but the hope is that this will soon be fixed in collaboration with the other primary author of this service, C. Scott Ananian.
Prototyping work on the storage service and REST API is progressing well. The storage service now has early support for bucket creation and multiple bucket types. We decided to configure the storage service as a backend for the REST API server. This means that all requests will be sent to the REST API, which will then route them to the appropriate storage service without network overhead. This design lets us keep the storage service buckets very general by adding entry point specific logic in front-end handlers. The interface is still well-defined in terms of HTTP requests, so it remains straightforward to run the storage service as a separate process. We refined the bucket design to allow us to add features very similar to Amazon DynamoDB in a future iteration. There is also an early design for light-weight HTTP transaction support.
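As a rough illustration of the layering described above, the sketch below puts a generic storage service behind a REST front end inside the same process. All class names, paths and the bucket type string are hypothetical and not taken from the actual prototype; the only idea carried over from the text is that the storage interface stays expressed as HTTP-style requests, so it could later be moved into a separate process.

```python
from typing import Dict, Optional, Tuple

Response = Tuple[int, bytes]  # (HTTP-style status code, body)


class StorageService:
    """Generic buckets addressed purely by (method, path, body)."""

    def __init__(self) -> None:
        self._buckets: Dict[str, Dict[str, bytes]] = {}
        self._types: Dict[str, str] = {}

    def handle(self, method: str, path: str, body: Optional[bytes] = None) -> Response:
        parts = [p for p in path.split("/") if p]
        if len(parts) == 1 and method == "PUT":
            # PUT /{bucket} creates a bucket; the body names its (hypothetical) type.
            self._buckets.setdefault(parts[0], {})
            self._types[parts[0]] = (body or b"kv").decode()
            return 201, b""
        if len(parts) == 2 and parts[0] in self._buckets:
            bucket, key = self._buckets[parts[0]], parts[1]
            if method == "PUT":
                bucket[key] = body or b""
                return 201, b""
            if method == "GET" and key in bucket:
                return 200, bucket[key]
        # Anything else is treated as not found in this sketch.
        return 404, b""


class RestApi:
    """Front end holding entry point specific logic; calls storage in-process."""

    def __init__(self, storage: StorageService) -> None:
        self._storage = storage
        self._storage.handle("PUT", "/pages.html", b"kv")

    def handle(self, method: str, path: str, body: Optional[bytes] = None) -> Response:
        parts = [p for p in path.split("/") if p]
        # Hypothetical entry point: /v1/pages/{title}/html
        if len(parts) == 4 and parts[:2] == ["v1", "pages"] and parts[3] == "html":
            title = parts[2].replace(" ", "_")  # title normalization lives in the front end
            # Plain method call: no network hop between API and storage.
            return self._storage.handle(method, f"/pages.html/{title}", body)
        return 404, b""
```

For example, `RestApi(StorageService()).handle("PUT", "/v1/pages/Main Page/html", b"<p>hi</p>")` stores the body under the normalized title, and a later `GET` on `/v1/pages/Main_Page/html` reads it back.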
Matt Walker is sadly leaving the Foundation by the end of this month to follow his passion of building flying cars. This means that we currently have three positions open in the service group, which we hope to start filling soon.
Core Features
Growth
In July, the Growth team completed its second round of A/B testing of signup invitations for anonymous editors on English Wikipedia, including data analysis. The team also built the first API and interface prototypes for task recommendations. This new system, first aimed at brand new editors, makes suggestions based on a user's previous edits.
On Android this month we released to production accessibility and styling features which were requested by our users, such as a night mode for reading in the dark and a font size selector. We also released an onboarding screen that asks users to sign up.
Our plan for next month is to get user feedback from Wikimania, wrap up our styling fixes, and begin work on an onboarding screen the first time that someone taps edit.
In side project work, the team spent time on API continuation queries, Android IP editing notices, Amazon Kindle and other non-Google Play distribution, and Google Play reviews (now that the Android launch dust has settled, mobile apps product management will be triaging the reviews). In partnerships work, the team met with Mozilla to talk about future plans for the Firefox OS HTML5 app (e.g., repurposing the existing mobile website, but without any feature reduction) and how Wikimedia search might be further integrated into Firefox OS, and also spoke with Canonical about how Wikipedia might be better integrated into the forthcoming Ubuntu Phone OS.
Routine pre- and post-launch configuration changes were made to support operator zero-rating, with routine technical assistance provided to operators and the partner management team to help add zero-rating and address anomalies. The team also continued its search for a third Partners engineering teammate.
Wikipedia Zero (partnerships)
- We served an estimated 68 million free page views in July through Wikipedia Zero. We continue to bring new partners into the program, though none launched in July. Adele Vrana met with prospective partners and local Wikimedians in Brazil. We published our operating principles to increase transparency.
Language engineering communications and outreach
MediaWiki Core
To help users with local-only accounts that are going to be forcibly renamed due to the SUL finalisation, the team is working on a form that lets those users request a rename. These requests will be forwarded to the stewards to handle. The SUL team is currently in consultation with the stewards about how they would like this tool to work. When this consultation is wrapped up, the team will begin design and implementation.
To help users get globally renamed without having to request renames on potentially hundreds of wikis, the team implemented and deployed GlobalRenameUser, a tool which renames users globally. As the tool is designed to work post-finalisation, it only performs renames where the current name is global, and the requested name is totally untaken (no global account and no local accounts exist with that name).
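The eligibility rule described above can be written down as a small predicate. This is only a sketch of the stated rule, not the tool's code; the function name and the in-memory account sets are hypothetical stand-ins for the real CentralAuth data.

```python
from typing import Dict, Set


def can_rename_globally(
    current: str,
    requested: str,
    global_accounts: Set[str],
    local_accounts: Dict[str, Set[str]],
) -> bool:
    """Return True only if the stated post-finalisation rule is satisfied.

    global_accounts: names that exist as global accounts (hypothetical store).
    local_accounts: wiki name -> names that exist locally on that wiki.
    """
    current_is_global = current in global_accounts
    # "Totally untaken": no global account and no local account anywhere.
    requested_is_free = requested not in global_accounts and all(
        requested not in names for names in local_accounts.values()
    )
    return current_is_global and requested_is_free
```

Under this rule a local-only user cannot be renamed with the tool, and a global user cannot take a name that still exists locally on any wiki.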
To help users who get renamed by the finalisation and, despite our best efforts to reach out to them, did not get the chance to request a rename before the finalisation, the team is working on a feature to let users log in with their old credentials. The feature will display an interstitial when they log in, informing them that they logged in with old credentials and that they need to use new ones. We are also considering a persistent banner for those users, so that they definitely know they need to use their new credentials. An early beta version of this feature is complete, and now needs design and product refinements to be completed.
To help users who get renamed by the finalisation and, as a result, have several accounts that were previously local-only turned into separate global accounts, the team is working on a tool to merge global accounts. We chose to merge accounts as it was the easiest way to satisfy the use case without causing further local-global account clashes that would cause us to have to perform a second finalisation. The tool is in its preliminary stages.
The team also globalised some accounts that were not globalised but had no clashes. These accounts were either created in this local-only form due to bugs, or are accounts from before CentralAuth was deployed where the user never globalised. As these accounts had no clashes, there were no repercussions to globalising these accounts, so we did this immediately.
At present, no date has been chosen for the finalisation. The team plans to have the necessary engineering work done by the end of the quarter (end of September 2014), and have a date chosen by then.
Next month the team plans to continue work on these features.
Security auditing and response
Wikimedia Release Engineering Team
A lot of progress was made on making Phabricator suitable as a task/bug tracking system for Wikimedia projects. You can see the work to be sorted and completed at this workboard.
The Beta Cluster now runs with HHVM, bringing us much closer to full HHVM deployment. In addition, the Language Team deployed the new Content translation system on the Beta Cluster with the help of the Release Engineering team.
The second round of public RFP for third-party MediaWiki release management was conducted and concluded.
We no longer use the third-party Cloudbees service for any of our Jenkins jobs and now run all jobs locally. This will enable us to better diagnose issues with our build process, especially as it pertains to our browser tests (which still mostly run on SauceLabs).
After moving tests from the browsertests repository to the repositories of the extensions being tested in June, as well as porting a significant set of tests to MediaWiki core itself, we completely retired the Jenkins instance running on a third-party host in favor of running test builds from the Wikimedia Jenkins instance, and we deleted the /qa/browsertests code repository. These moves are the result of more than two years of work. In addition, we have added more functions to the API wrapper used by browser tests, improved support for testing in Vagrant virtual machines, added new Jenkins builds for extensions, and improved the function of the beta labs test environments by preventing database locks and stopping users from being logged out by accident.
Quality Assurance/Browser testing
Multimedia
As described in our improvements plan, these new features are being prototyped and will be carefully tested with target users in August, so we can validate their effectiveness before developing and deploying them in September. You can see some of our thinking in this presentation.
This month, we continued to work on the Structured Data project with the Wikidata team and many community members, to implement machine-readable data on Wikimedia Commons. We prepared to host a range of online and in-person discussions to plan this project with our communities, and aim to develop our first experiments in October, based on their recommendations. We also continued a major code refactoring for the UploadWizard, and fixed a number of bugs in some of our other multimedia tools.
Last but not least, we prepared seven different multimedia roundtables and presentations for Wikimania 2014, which we will report on in more depth in August. For now, you can keep up with our work by joining the multimedia mailing list.
Mukunda implemented restricting access to tasks in a certain project, which can be tested on fab.wmflabs.org. As a follow-up, he investigated enforcing the security policy on files and attachments as well, and replacing the IRC bots with Phabricator's chatbot. Chase worked on initial migration code to import data from Bugzilla reports into Phabricator tasks (and ran into missing API code in Phabricator), investigated configuring Exim for mail, set up a data backup system for Phabricator, and upgraded the dedicated Phabricator server to Ubuntu Trusty. Quim started documenting Phabricator.
Andre helped make decisions on defining field values and on how to handle certain Bugzilla fields in the import script, and sent a summary email to wikitech-l about the Phabricator migration status.
- Tools for mass migration of legacy translated wiki content
- Wikidata annotation tool
- Email bounce handling to MediaWiki with VERP
- Google Books, Internet Archive, Commons upload cycle
- UniversalLanguageSelector fonts for Chinese wikis
- MassMessage page input list improvements
- Book management in Wikibooks/Wikisource
- Parsoid-based online-detection of broken wikitext
- Usability improvements for the Translate extension
- A modern, scalable and attractive skin for MediaWiki
- Automatic cross-language screenshots for user documentation
- Separating skins from core MediaWiki
- Chemical Markup support for Wikimedia Commons
- Improving URL citations on Wikimedia
- Historical OpenStreetMap
- Welcoming new contributors to Wikimedia Labs and Tool Labs
- Evaluating, documenting, and improving MediaWiki web API client libraries
- Feed the Gnomes - Wikidata Outreach
- Template Matching for RDFIO
- Switching Semantic Forms Autocompletion to Select2
- Catalogue for Mediawiki Extensions
- Generic, efficient localisation update service.
Volunteer coordination and outreach
- 2014-07-10 — Frontend standardization discussion focusing on Requests for comment/Redo skin framework;
- 2014-07-16 — RfC discussion focusing on Requests for comment/Vertical writing support;
- 2014-07-23 — RfC discussion focusing on Requests for comment/Composer managed libraries for use on WMF cluster, in which the architecture committee approved the RfC;
- 2014-07-30 — RfC discussion focusing on Requests for comment/CentralNotice Caching Overhaul - Frontend Proxy.
Analytics/Editor Engagement Vital Signs
We analyzed trends in mobile readership and contributions, with a particular focus on the tablet switchover and the release of the native Android app. We found that in the first half of 2014, mobile surpassed desktop in the rate at which new registered users become first-time editors and first-time active editors in many major projects, including the English Wikipedia. An update on mobile trends will be presented at the upcoming Monthly Metrics meeting on July 31.
Development of a standardised toolkit for geolocation, user agent parsing and accessing pageviews data was completed.
We supported the multimedia team in developing a research study to objectively measure the preferences of Wikipedia editors and readers.
We hosted the July research showcase with a presentation by Aaron Halfaker on four Python libraries for data analysis, and a guest talk by the Center for Civic Media's Nathan Matias on the use of open data to increase the diversity of collaboratively created content.
We prepared 8 presentations that we will be giving or co-presenting next week at Wikimania in London. We also organized the next WikiResearch hackathon that will be jointly hosted in London (UK) (during the pre-conference Wikimania Hackathon) and in Philadelphia (USA) on August 6-7, 2014.
We filled the fundraising research analyst position: the new member of the Research & Data team will join us in September and we'll post an announcement on the lists shortly before his start date.
Lastly, we gave presentations on current research at the Wikimedia Foundation at the Institute for Scientific Interchange (Turin) and at the DesignDensity lab (Milan).
The Kiwix project is funded and executed by Wikimedia CH.
- We have pre-release binaries of the next 0.9 (final) release. Except for OSX, everything seems to work fine so far. Support for the Raspberry Pi was finally merged into the kiwix-plug master branch; this offers new perspectives because the price to create a Kiwix-Plug has dropped to around USD 100. We also started an engineering collaboration with ebook reader manufacturer Bookeen (in the scope of the Malebooks project) to be able to offer an offline version of Wikipedia on e-ink devices.
- We participated in the Google Serve Day at Google Zurich. The goal was to meet Google engineers for a day and have them work on open source projects. The result was a dozen fixed bugs and implemented features, mostly in Kiwix for Android, but also in Kiwix for desktop and MediaWiki.
- Four developers had a one-week hackathon in Lyon, France to develop an offline version of the Gutenberg library. We're currently polishing the code and plan a release soon; our partners and sponsors plan the first deployments in Africa in Autumn.
- Last but not least, a proof-of-concept of a Kiwix iOS app was made, so we might release a first app before the end of the year.
The Wikidata project is funded and executed by Wikimedia Deutschland.
- The biggest improvement around Wikidata in July is the release of the entity suggester. It makes it a lot easier to see what kind of information is missing on an item. Helen and Anjali, Wikidata's Outreach Program for Women interns, continued improving user documentation and outreach around Wikidata as well as worked on a new design for the main page. Guided Tours were published, helping newcomers find their way around the site. The developers further worked on supporting badges (like "featured article"), redirects between items, the monolingual text datatype (to be able to express things like the motto of a country) as well as the first implementation steps for the new user interface design. Additionally the first JSON dumps were published.
Future
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.