Wikimedia Engineering/Report/2014/April
Engineering metrics in April:
- 158 unique committers contributed patchsets of code to MediaWiki.
- The total number of unresolved commits went from around 1315 to about 1305.
- About 30 shell requests were processed.
Major news in April includes:
- the change of format of MediaWiki localization files from PHP to JSON, and the associated modernization of the LocalisationUpdate extension;
- the move of Wikimedia Labs to a new data center;
- the “Heartbleed” security vulnerability and how the Wikimedia Foundation's team responded to it;
- an explanation of how the Mobile team uses Trello to plan their development sprints;
- a project report on a grant to create "gadgets" for VisualEditor.
Note: We're also providing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.
Upcoming events
There are many opportunities for you to get involved and contribute to MediaWiki and technical activities to improve Wikimedia sites, both for coders and contributors with other talents.
For a more complete and up-to-date list, check out the Project:Calendar.
Date | Event | Contact |
---|---|---|
9 May 2014–11 May 2014 | Zürich Hackathon 2014 (Zürich, Switzerland) | Manuel Schneider |
14 May 2014 | Moving to Phabricator (Latest news after Hackathon); 18:00-19:00 UTC in #wikimedia-office on IRC | Andre Klapper |
15 May 2014 | Tech Talk: Elasticsearch at 19:00 UTC | Nik Everett |
19 May 2014 | Performance guidelines at 19:00 UTC in #mediawiki | Sumana Harihareswara |
21 May 2014 | Language Engineering monthly office hour | Runa Bhattacharjee |
21 May 2014 | Making Wikipedia Fast (San Francisco, USA) Video: Google+ - YouTube. Questions at #wikimedia-dev | Ori Livneh, Aaron Schulz |
21 May 2014–23 May 2014 | SMWCon Spring 2014 (Montreal, Canada) | Conference talk |
30 May 2014–1 June 2014 | WikiConference USA (New York, USA) | wikicon@wikimedianyc.org |
Personnel
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.
- VP of Engineering
- ScrumMaster
- Software Engineer - VisualEditor Team
- Software Engineer - Fundraising Team
- Software Engineer - Internationalization
- Software Engineer (Partners)
- QA Tester
- Research Analyst - Fundraising
- Product Manager - Language Engineering
- Operations Security Engineer
- Data Center Engineer (Contractor)
Announcements
- Giuseppe Lavagetto joined the Operations team as Operations Engineer (announcement).
- Aaron Schulz is now Senior Performance Engineer (announcement).
- Dmitry Brant joined the Mobile App Team as Software Developer (announcement).
- Danny Horn joined the Product Development team as Product Manager (announcement).
Technical Operations
- The Wikimedia Foundation has chosen a winning RFP bid, a contract has been executed and implementation is underway. A public announcement is being prepared for the upcoming week.
Wikimedia Labs
Labs metrics in April:
- Number of projects: 153
- Number of instances: 345
- Amount of RAM in use (in MBs): 1,454,592
- Amount of allocated storage (in GBs): 16,515
- Number of virtual CPUs in use: 716
- Number of users: 3,064
- The migration of Labs and Tool Labs to the Ashburn data center is complete, and most of the hardware in Tampa has been shut down and packed up for shipping to the new (to be announced) data center.
- Post-migration, many projects which had public IPs are now relying on the internal Labs web proxy instead. That has caused a few unexpected bugs in project web access, but provides several benefits including HTTPS access and increased user data privacy.
Tampa data center
- During the last month, our data center footprint in Tampa has been reduced from 24 racks to just 6. A copy of all essential data will remain in the Tampa facility until we've finished setting up the relevant services in our upcoming new data center.
Editor retention: Editing tools
Core Features
Growth
Support
During the last month, the team continued setup tasks on the Partners portal, JSON configuration store, and graceful image quality reduction. The team also updated Android and iOS Wikipedia app reboot visual flourishes for Wikipedia Zero, analyzed anomalous access patterns and proxy-oriented configuration and tech documentation to close gaps, and created bugfixes for unnecessary charge warnings in the "Read in another language" language picker plus direct upload.wikimedia.org image hyperlinks on File: pages. The team also removed some legacy ETL code from the ZeroRatedMobileAccess extension.
Yuri did outreach abroad and continued analytics work on SMS/USSD pilot data. The team also generated two custom pageview analyses for an operator to distinguish traffic by high level device access characteristics as part of ongoing discussions. The team also explored legacy Android Wikipedia app trends.
Additionally, the team cut Android Wikipedia app alpha builds, worked on User-Agent string and URL format updates for the forthcoming iOS Wikipedia app to ensure pageview logging, and performed app code review.
The team discussed MCC-MNC logging with the community as a way to address mobile IP drift, and it appears to be okay to proceed. The team will, however, reduce the date granularity of log lines to the day (e.g., YYYYMMDD) with a patch to MediaWiki core.
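To illustrate what reducing log-line date granularity to the day means, here is a minimal Python sketch. It is not the MediaWiki core patch mentioned above; the function names, the tab-separated log format and the placeholder MCC-MNC value `001-01` are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def coarsen_to_day(timestamp: datetime) -> str:
    """Return the UTC calendar day of a timezone-aware timestamp as YYYYMMDD.

    The time of day is dropped entirely, so two requests made on the
    same UTC day produce the same value.
    """
    return timestamp.astimezone(timezone.utc).strftime("%Y%m%d")


def log_line(mcc_mnc: str, timestamp: datetime) -> str:
    """Build a hypothetical log line: day-granularity date, a tab, the MCC-MNC code."""
    return f"{coarsen_to_day(timestamp)}\t{mcc_mnc}"


# Placeholder MCC-MNC value; the real log format is not given in this report.
example = log_line("001-01", datetime(2014, 4, 17, 23, 59, 58, tzinfo=timezone.utc))
print(example)
```

Because only the day survives, a log line written this way can no longer be tied to the exact moment of an individual request.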
Routine pre- and post-launch configuration changes were made to support operator zero-rating, and in-depth technical assistance was provided to operators and the partner management team to help add zero-rating and address anomalies.
The team emailed further about full-text search in reboots of Wikipedia apps, and may resume investigation of it later.
The team also examined requirements for portal and general partners engineering human resources.
Wikipedia Zero (partnerships)
- IPKO in Kosovo launched Wikipedia Zero, bringing us to a total of 28 partners in 26 countries. We delivered 68 million free page views in April. Adele Vrana visited South Africa to meet with MTN (current Wikipedia Zero partner), prospective partners, members of Wikimedia South Africa (WMZA) and the Sinenjongo High School. This trip was part of a broader strategy to promote Wikipedia in our partners' corporate social responsibility (CSR) and education initiatives, increasing awareness and impact locally. We are identifying new collaboration opportunities with MTN and local organizations, including the Wikimedia chapter in South Africa and other mission-aligned nonprofits. Additionally, we will continue to support the local initiative created by Sinenjongo High School teachers and students.
Language engineering communications and outreach
- A beta feature that shows a red interlanguage link when the article is not translated to the user's language;
- Basic handling of templates and images;
- Basic publishing of the translation as a formatted article;
- Testing infrastructure for the server.
MediaWiki Core
Security auditing and response
Quality assurance
Quality Assurance/Browser testing
Multimedia
In April, the multimedia team released Media Viewer v0.2 on 14 pilot sites, in preparation for a wider deployment next month: overall response has been favorable so far, and a growing majority of survey respondents are finding this new multimedia browser useful. Gilles Dubuc, Mark Holmquist, Gergő Tisza and Aaron Arcos developed final features for this release, as described on this release's wall, based on designs by Pau Giner. We also developed a set of metrics dashboards to track global activity, image load and network performance, as well as local metrics dashboards for selected sites: first results show a decline in image load time, and suggest that Media Viewer loads faster than file description pages. We invite you to test the latest version of Media Viewer (see these testing tips) and share your feedback.
Fabrice Florin led product planning and management, hosting a planning meeting for our next development cycle (leading to a wall of tasks): for the next six weeks, we plan to divide our time between Media Viewer (e.g. serious bugs, basic zoom feature), Technical Debt (e.g. image scalers) and Upload Wizard. Keegan Peterzell and Fabrice announced the gradual release of Media Viewer on dozens of wiki sites, starting new discussions in collaboration with our community partners, as well as launching surveys in multiple languages to get reader feedback about this tool. For more updates about our multimedia work, we invite you to join the multimedia mailing list.
Project management tools/Review
Volunteer coordination and outreach
Analytics/Logging infrastructure
We continued to provide support for the Vital Signs project by working with the Dev team on metrics and code requirements, as well as visualization and data presentation options.
Aaron Halfaker presented his work on Snuggle – an observation and mentoring system for new Wikipedians – at the 2014 ACM Conference on Human Factors in Computing Systems (CHI '14).
We published longitudinal data on editor activation and mobile vs desktop new user acquisition across the largest Wikipedias.
We posted a job opening for a full-time Research Analyst to support Fundraising and become part of our team.
We started work on the editor lifecycle and editor trajectories, with the goal of understanding the drivers of active editors and power editors and modeling the survival of contributors to Wikimedia projects.
We provided ad-hoc support to the Product team for the onboarding of the new Executive Director.
This month we also released tools to perform analysis of Wikimedia data. Aaron Halfaker published a Python library called mediawiki-utilities for extracting and processing data from MediaWiki installations, slave databases and XML dumps. Oliver Keyes released WikipediR, an R wrapper for the MediaWiki API, aimed particularly at the Wikimedia 'production' wikis, such as Wikipedia.
The Kiwix project is funded and executed by Wikimedia CH.
- We have finally released a first experimental ZIM file of TED talks. We have packed the 250 talks about business into a single 7 GB library. This includes not only the videos, but also short speaker bios and thousands of subtitles in more than 50 languages. We will soon fix the last critical issues and release other TED ZIM files with talks about entertainment, science, etc.
- We have also migrated our download server to a better one. Besides providing a better storage system and more bandwidth, it has 9 TB of disk space. This was a mandatory step in industrializing our ZIM generation process, and it allows us to generate more ZIM files more often.
- We are also working with an e-reader manufacturer to have Kiwix installed and available on its devices so that the MALeBOOKS project (eBooks for Mali) can ship e-readers with not only thousands of free eBooks, but also the complete Wikipedia and Wikisource in French.
The Wikidata project is funded and executed by Wikimedia Deutschland.
- The Wikidata team got simple query functionality ready for a first demo at the WMF Metrics and Activities Meeting. The entity suggester that a team of students is working on also received its finishing touches and should be ready for release soon. Once it is live, it will suggest missing information on an item so it is easier to see what should be added. We also welcomed 2 interns as part of the Outreach Program for Women to help with documentation, social media outreach and mobile app concepts. Wikiquote now manages its language links via Wikidata. Additionally, it is now possible to automatically add links to other sister projects in the sidebar of an article using Wikidata. A Wikidata Toolkit was released, as well as a library that lets you use Wikidata's labels for translation anywhere on the web.
Future
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.