Wikimedia Engineering/Report/2014/February
Engineering metrics in February:
- 149 unique committers contributed patchsets of code to MediaWiki.
- The total number of unresolved commits went from around 1320 to about 1453.
- About 22 shell requests were processed.
Major news in February includes:
- a call for volunteers to test the upcoming multimedia viewer;
- improvements to VisualEditor's media and template editors;
- the launch of the Flow discussion system on two pilot talk pages on the English Wikipedia;
- the launch of GettingStarted to 30 new language versions of Wikipedia, including all of the top 10 projects by number of page views;
- improvements to the tools and process used to deploy code to Wikimedia production sites;
- the release of the first archive of the entire English Wikipedia with thumbnails, for offline use.
Note: We're also providing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.
Personnel
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.
- VP of Engineering
- Software Engineer - Growth
- Software Engineer - VisualEditor (Features)
- Software Engineer - Language Engineering
- Software Engineer - Mobile (Frontend)
- Software Engineer - Mobile (Android Apps)
- Automation Engineer
- Release Engineer
- Director of Community Engagement (Product)
- Product Manager - Multimedia
- Operations Security Engineer
Announcements
- Leila Zia joined the Analytics team as Research Scientist (announcement).
- Faidon Liambotis was promoted to Principal Operations Engineer (announcement).
- YuFei Liu joined the UX Design team as Visual Design Intern (announcement).
- Following changes in the Language engineering team, Amir Aharoni is now the Acting Product Manager, and Runa Bhattacharjee the ScrumMaster (announcement).
Technical Operations
- Final negotiations on the 3 remaining data center bids were completed in February, and the Wikimedia Operations team will make a decision in the first week of March. Expect a public announcement soon.
Wikimedia Labs
Labs metrics in February:
- Number of projects: 129
- Number of instances: 458
- Amount of RAM in use (in MBs): 1,812,992
- Amount of allocated storage (in GBs): 24,540
- Number of virtual CPUs in use: 906
- Number of users: 2,714
- The Wikimedia Labs infrastructure in the eqiad data center has been deployed with the OpenStack Havana release, and testing completed in February. Labs users will have 2 weeks to migrate their own projects & instances starting in March. During the last two weeks of March, the Wikimedia Operations team will handle the transfer of the remaining instances that have not been migrated by users themselves.
ulsfo redeployment
- During a short deployment of our West Coast data center ulsfo in October 2013, several reliability problems were found with some of our network service providers, which forced us to take this site out of service until they could be resolved. We have since worked to improve reliability and increase redundancy of network transit and transport to this site. As of the week of February 3rd, ulsfo is in full production usage again, and is now serving traffic for the US West Coast, Oceania and large parts of Asia. A blog post is being prepared describing the improvements in user-perceived site performance.
eqiad data center capacity expansion
- The Wikimedia Foundation has expanded the capacity of its main data center site eqiad in Ashburn, Virginia by 33%. A fourth row of racks has been added, and all power & networking infrastructure has been installed and configured in February. The added rack space is available for new equipment as of February 24th.
Editor retention: Editing tools
Part of the team has continued to mentor two Outreach Program for Women (OPW) interns. This internship ends mid-March. Others are mentoring a group of students in a Facebook Open Academy project to build a Cassandra storage back-end for the Parsoid round-trip test server.
We have a first version of a Debian package for Parsoid ready. This package has yet to find a home base (repository) from which it can be installed. This will soon make the installation of Parsoid as easy as apt-get install parsoid.
Core Features
Growth
Support
Wikipedia Zero (partnerships)
- In February, we launched Wikipedia Zero with MTN South Africa (Opera Mini browser only). MTN South Africa responded directly to the kids of Sinenjongo High School with an open letter to the students and the youth of South Africa. They said they agree that Wikipedia could give a boost to their education system, and that offering Wikipedia Zero is a small thing that could change everything (see video on YouTube).
- We also launched Wikipedia Zero with Safaricom, the largest operator in Kenya. We now have three partners in Kenya, covering 90% of all mobile subscribers. South Africa is our 23rd country to launch, and Safaricom is our 27th operator partner.
- The Mobile Partnerships team attended Mobile World Congress in Barcelona, where we met with existing operator partners, prospective partners and tech companies who want to support the mission. At the conference, our Wikipedia Text pilot with Airtel Kenya and the Praekelt Foundation was nominated as a finalist for the GSMA Global Mobile awards in the education category.
More convenient shortcuts were added by Niklas Laxström to the Translate extension.
Kartik Mistry and Amir Aharoni are working on stabilizing the browser tests for all the language extensions and on setting up more robust online staging sites.
Language engineering communications and outreach
MediaWiki Core
[...] (ext_zend_compat) under HipHop, with the goal of using it for our Lua module. Ori Livneh is working on packaging and deployment issues, as well as generally wrangling the overall development effort. Aaron Schulz is starting to investigate what is needed for wmferrors support.
The Release and QA team had their latest quarterly review on February 13. Highlights from the meeting include:
- We will be hiring two new positions (a QA Automation Engineer and a Test Infrastructure Engineer).
- We will work through all pain points from the Development and Deployment process review.
- We will continue performing incremental improvements to the current deployment script (known as "scap") to better inform future deployment tooling work.
- We will create a way for tests to create fake/stub data (for use in throw-away/one-off test instances).
- We will make our browser tests more accurate across testing and production environments.
To see a real-life example of what it looks like to deploy code on the WMF server cluster, watch this screencast created by Bryan Davis. It shows what the person deploying the code sees when doing a localization (translations) update. A deployment that includes new changes to the code (e.g. MediaWiki and extensions) on the servers would be different.
The suite of tools that make up the current MediaWiki deployment tooling continues to be updated and rewritten in Python. You can follow this work in the repository's history.
The updated Development and Deployment Process flowchart is now created using Blockdiag, a Python library for converting text into flow charts. You can see the current draft in the newly-minted Release Engineering repository.
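As a rough illustration of the text-to-diagram approach, here is a minimal Python sketch that generates Blockdiag source for a linear flow. The helper name to_blockdiag and the four steps are hypothetical and do not reflect the actual WMF flowchart; the sketch only builds the diagram text and does not call the Blockdiag library itself.

```python
def to_blockdiag(steps):
    """Render an ordered list of step labels as blockdiag source text.

    Labels are assumed not to contain double quotes.
    """
    lines = ["blockdiag {"]
    # Declare each node with a stable identifier and a readable label.
    for index, label in enumerate(steps):
        lines.append(f'  step{index} [label = "{label}"];')
    # Chain consecutive steps with directed edges.
    for index in range(len(steps) - 1):
        lines.append(f"  step{index} -> step{index + 1};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical steps for illustration only, not the real process.
HYPOTHETICAL_STEPS = ["Write patch", "Code review", "Merge", "Deploy"]

diagram_source = to_blockdiag(HYPOTHETICAL_STEPS)
print(diagram_source)
```

The resulting text can be saved to a file and passed to the blockdiag command-line tool to render an image. Keeping the chart as text means changes to the process can be reviewed as ordinary diffs.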
There is now a matrix showing the requirements for deployment tooling for 3 projects (MediaWiki, Parsoid (and related), and ElasticSearch (and related)). This is not a fixed document and will grow/change as more is learned.
Security auditing and response
Quality assurance
Nik Everett made the CirrusSearch browser tests runnable on a Labs instance that has ElasticSearch. The job is now triggered from Gerrit and is being improved.
The experimental Meetbot instance set up by Antoine back in November has been overhauled and is now maintained by the community in the tools-labs project (thank you Tim Landscheidt).
Several Debian packages are now built automatically via Jenkins thanks to an effort by Carl Fürstenberg (https://integration.wikimedia.org/ci/view/Ops-DebGlue/). It helped with packaging Parsoid, among others.
Quality Assurance/Browser testing
Multimedia
In February, the multimedia team continued to focus on Media Viewer v0.2, getting it ready for a wider release next quarter. Gilles Dubuc, Mark Holmquist, Gergő Tisza and Aaron Arcos released a variety of new features, such as: permissions, file usage, pre-loading of images, previews during load and an improved full-screen experience. We also started development on a better 'Use this file' panel, including share, embed and download features. Pau Giner designed this panel, as well as a new Zoom feature for a future version v0.3 of Media Viewer. We invite you to test the latest version (see the testing tips) and share your feedback.
Fabrice Florin managed product development for Media Viewer and prepared the release plan for a gradual deployment of Media Viewer out of beta in coming months, based on the team's latest development goals. We also hosted an IRC chat to discuss Media Viewer with the rest of the community and plan our next steps together. Lastly, the video RfC we started last month was closed with a community recommendation to not support the proprietary MP4 video format on our sites; as a result, we will only support open video formats like WebM and Ogg in the next version (v0.3) of Media Viewer. For more updates, we invite you to join the multimedia mailing list.
Project management tools/Review
- Compacting interlanguage links
- MediaWiki Homepage Redesign
- Complete the MediaWiki API development course on Codecademy
- Clean up Parsoid round-trip testing UI
- Clean up tracing/debugging/logging inside Parsoid
- UploadWizard: OSM Embedding
Getting Facebook Open Academy projects up to speed is proving even more complex than expected, but we are getting there slowly. All students and mentors met at the kick-off hackathon at Facebook headquarters on February 7–9 (see Marc-André Pelletier's report).
Wikimedia applied to Google Summer of Code 2014 and we were accepted. We also confirmed our participation in FOSS Outreach Program for Women round 8. We are organizing both programs simultaneously under a common umbrella, as we did last year with great success.
Volunteer coordination and outreach
- Resolved the issue "No sampled-1000 tsv file for 2014-02-06 on stat1002";
- Wikipedia Zero team investigated a ~30% increase in the number of lines in the zero tsv files between the 20140218 and 20140220 files;
- Wikipedia Zero team investigated a slight drop in zero requests around 2014-02-08;
- Data for ULSFO Cache performance prepared for Ops blog post.
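A check like the line-count comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and file names are hypothetical, the demo writes two throw-away files in place of real tsv data, and it does not reproduce the team's actual tooling.

```python
import os
import tempfile


def count_lines(path):
    """Count the lines in a file without loading it all into memory."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def percent_change(before, after):
    """Return the relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("cannot compute a relative change from zero lines")
    return (after - before) * 100.0 / before


# Demo: two small throw-away files stand in for consecutive daily tsv files.
with tempfile.TemporaryDirectory() as workdir:
    older = os.path.join(workdir, "zero-older.tsv")  # hypothetical name
    newer = os.path.join(workdir, "zero-newer.tsv")  # hypothetical name
    with open(older, "w") as handle:
        handle.write("row\n" * 10)
    with open(newer, "w") as handle:
        handle.write("row\n" * 13)
    older_count = count_lines(older)
    newer_count = count_lines(newer)

change = percent_change(older_count, newer_count)
print(f"{older_count} -> {newer_count} lines ({change:+.1f}%)")
```

Run daily, a comparison of this kind could flag unusual jumps or drops in volume before anyone relies on the affected files.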
This month, we welcomed Leila Zia as the newest addition to the team. Leila joins the Foundation as a research scientist after completing a PhD in management science and engineering at Stanford University. Her work will initially focus on modeling editor lifecycles to better understand what affects their survival and retention.
We hosted the first public Research and Data showcase, a monthly showcase of research conducted by the team and other researchers in the organization. This month, we presented two studies on Wikipedia article creation trends and on the measurement of mobile browsing sessions. The showcase is hosted at the Wikimedia Foundation and live streamed on YouTube every 3rd Wednesday of the month at 11:30am Pacific Time.
We attended the 17th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW '14) in Baltimore. Research on Wikipedia and wiki-based collaboration has been a major focus of CSCW in the past, and this year three Wikipedia research papers were presented. We hosted a session to discuss collaboration opportunities for researchers interested in tackling problems of strategic importance for Wikimedia (a detailed CSCW '14 report will follow on wiki-research-l).
We started creating public documentation for data sources and tools used by the team for research and data analysis and porting docs previously hosted on internal wikis (for example: analytics/geolocation).
We continued to provide ad-hoc support to various teams at the Foundation and worked closely with the Growth and Mobile teams to prepare and review results for their respective quarterly reviews.
The Kiwix project is funded and executed by Wikimedia CH.
- For the first time, we have released a ZIM file of the entire Wikipedia in English with all encyclopedic articles and thumbnails (download the 90GB file via torrent). In our announcement, we've also explained how we generate those archives and advertised the tools we've been working with, like mwoffliner and zimwriterfs. This month, a student also worked on the creation of ZIM files containing TED talks. The internship is now over and was a success; ZIM files will be published soon. Preparation work for our Usability Hackathon has started.
The Wikidata project is funded and executed by Wikimedia Deutschland.
- Wikisource now has access to the data in Wikidata, like ISBNs and the date of birth of an author. The Lua interface for Wikidata has been extended significantly to make it more powerful and easier to use. Support for article badges has seen more work and is now missing mostly the user interface part. Loading time of items on Wikidata has been improved drastically. Everyone is asked to provide input for the upcoming redesign of Wikidata's user interface.
Future
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.