Engineering metrics in January:
- 112 unique committers contributed patchsets of code to MediaWiki.
- The total number of unresolved commits remained stable around 650.
- About 45 shell requests were processed.
- Wikimedia Labs now hosts 155 projects and 931 users; to date 1473 instances have been created.
Major news in January includes:
Note: We're also proposing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.
There are many opportunities for you to get involved and contribute to MediaWiki and technical activities to improve Wikimedia sites, both for coders and contributors with other talents.
For a more complete and up-to-date list, check out the Project:Calendar.
| Date | Type | Event | Contact |
|---|---|---|---|
| 1 February 2013 | | Wikimedia workshop at Quark '13 (Zuarinagar, Goa, India) | |
| 2 February 2013 | | FOSDEM (Brussels, Belgium) | |
| 4 February 2013 | | QA: Triage/retest Article Feedback and Article Feedback 5 bug reports (tentative) | AKlapper, Valeriej |
| 11 February 2013 | | QA: Article Feedback New Features | Fabrice Florin, Cmcmahon, Qgil |
| 17 February 2013 | | GNUnify (Pune, Maharashtra, India) | |
| 19 February 2013 | | QA: Git/Gerrit Bug Triage | AKlapper, Valeriej |
| 21 February 2013 | | Wikipedia Engineering Meetup: Wikimedia & wikiHow mobile updates (San Francisco, CA, USA) | Qgil |
| 22 February 2013 | | Southern California Linux Expo (Los Angeles, USA) | |
| 25 February 2013 | | Mobile uploads to Wikimedia Commons | Michelle Grover, Qgil, Cmcmahon |
| 26 February 2013 | | Presentation of BlueSpice at 6PM UTC (Join link) | Mitevam |
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.
Production Site Switchover
- The Wikimedia Foundation switched over its primary data center from Tampa, Florida to Ashburn, Virginia on January 22. Given the scale and complexity of the migration, we scheduled three 8-hour windows to perform it, but we were able to complete it on the first attempt. Because the switchover involved, among other things, moving the master databases from Tampa to Ashburn, the site was set to 'read-only' mode for about 32 minutes. During that period, the site was available but no content could be created, edited or uploaded. As expected, there was some minor fallout from the migration, mostly due to configuration changes, but it was quickly contained by the Engineering and Operations teams.
- With this migration, the Tampa data center becomes our fail-over site, and we plan to perform site fail-over tests every few months. A few small non-core applications, such as RT, Etherpad and Bugzilla, still use Tampa as their primary site. They too will be migrated in the coming months.
Site infrastructure
- One of the main concerns of the migration was serving traffic from the new data center using empty memcached servers: the spike in load on the Apache and database servers could have been disastrous to the site. To address it, Tim Starling expanded the single-instance implementation of the 'Parser Cache' persistent store in Tampa to 3 sharded instances, and Asher Feldman built and replicated the databases across the 2 data centers.
- Another improvement, done by Asher and Peter Youngmeister, was the implementation of MHA (Master High Availability) on our MySQL clusters. Its primary objective is to automate the promotion of a slave database in a master database fail-over scenario and to reduce downtime, without suffering from replication integrity problems, without prolonging database latency, and without changing existing deployments.
- Faidon Liambotis and Mark Bergsma continued to work on the Ceph file object store. With Domas Mituzas' help, they identified a performance issue with the RAID card which caused severe read/write latency on the Ceph cluster. Faidon has confirmed with the vendor that it is a known problem and that no fix is available yet. We have ordered replacement RAID cards and swapped them in, and test results seem to indicate that the performance issue is solved.
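To illustrate the idea of spreading a persistent cache over 3 sharded instances, as mentioned above for the Parser Cache, here is a minimal sketch. It is not the MediaWiki implementation: the shard names, the key format and the choice of MD5 as a stable hash are all assumptions made for the example.

```python
import hashlib

# Hypothetical shard names; the report only says there are three instances.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(key, shards=SHARDS):
    """Map a cache key to one shard using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 4 bytes as an unsigned integer, then reduce modulo
    # the number of shards so every key lands on exactly one instance.
    index = int.from_bytes(digest[:4], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    for key in ("enwiki:pcache:1", "enwiki:pcache:2", "dewiki:pcache:1"):
        print(key, "->", shard_for(key))
```

A stable hash (rather than Python's built-in `hash`, which is randomized per process for strings) keeps the key-to-shard mapping identical across servers and restarts, which is what lets any application server find a previously stored entry.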
Fundraising
- Fundraising bastion hosts were deployed in the Ashburn and Tampa data centers. We also tweaked and tuned central logging and monitoring, and converted the remaining fundraising MyISAM tables to InnoDB, which should fix dump-induced replication lag.
Data Dumps
- This month, we looked at the process of using the XML dumps to create a local copy of a Wikimedia site: it turned out to be painful and cumbersome at best, and unfathomable for the end user at worst. As part of an attempt to improve this situation, there is now a new experimental tool available for *nix platforms, for generating MySQL tables from the XML stub and page content files. It is intended to read input files from various versions of MediaWiki and generate output for the version the user wants. Testing and feedback are encouraged.
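To give a feel for what turning an XML dump into table rows involves, here is a small sketch that streams `<page>` elements out of a dump and yields one dictionary per page. It is not the experimental tool mentioned above; the inline sample, the field names in the output and the handling of only the first-level `id`, `title` and `revision` elements are simplifying assumptions.

```python
import io
import xml.etree.ElementTree as ET

# Tiny inline sample standing in for a real dump file (assumption).
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/">
  <page>
    <title>Example</title>
    <id>1</id>
    <revision>
      <id>10</id>
      <text>Hello world</text>
    </revision>
  </page>
</mediawiki>"""


def local(tag):
    """Strip the XML namespace, so '{ns}page' becomes 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield one row per <page>, reading the dump incrementally."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        row = {"page_id": None, "title": None, "rev_id": None, "text": None}
        for child in elem:
            name = local(child.tag)
            if name == "title":
                row["title"] = child.text
            elif name == "id":
                row["page_id"] = int(child.text)
            elif name == "revision":
                for sub in child:
                    sub_name = local(sub.tag)
                    if sub_name == "id":
                        row["rev_id"] = int(sub.text)
                    elif sub_name == "text":
                        row["text"] = sub.text or ""
        yield row
        elem.clear()  # release the finished page's children


if __name__ == "__main__":
    for page in iter_pages(io.BytesIO(SAMPLE.encode("utf-8"))):
        print(page)
```

Streaming with `iterparse` matters because full dumps are far too large to load into memory at once; each page is emitted and discarded before the next one is read.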
Wikimedia Labs
- In January, we made a number of performance and usability improvements. Three compute nodes were added to the pmtpa zone. Alex Monk added Echo notification support to labsconsole, passwordless sudo is now the default for projects, and shell requests are created automatically on account creation. The sysadmin and netadmin roles have been combined into a single projectadmin role. GlusterFS was upgraded to fix a memory leak, but unfortunately the upgrade introduced a new bug that caused some instability in project storage. Work is ongoing to improve the project storage situation.
VisualEditor
In January, the team worked primarily on reviewing and cleaning up the code deployed in December. They spent time with their colleagues in the Parsoid team planning the next phase of development, which is aimed at making the VisualEditor the default editor for all Wikipedias from July 2013. The alpha version of the VisualEditor on mediawiki.org and the English Wikipedia was updated twice (1.21-wmf7 and -wmf8), fixing a number of bugs reported by the community and making some adjustments to the link inspector's functionality based on feedback.
Parsoid
In January, the Parsoid team did some spring cleaning and bug fixing. The serialization subsystem was overhauled: it now features simpler and more robust separator handling. Selective serialization was rewritten to deal with content deletions. It also features DOM diff-based change detection that does not rely on client-side change marking. Support for non-English wikis and local configurations was also improved substantially, and will likely stabilize in the coming weeks.
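The idea behind diff-based selective serialization can be sketched with a toy model: compare the original and edited node lists, reuse the original source text for nodes that did not change, and serialize only the changed ones. This is an illustration under strong assumptions, not Parsoid's algorithm: Parsoid works on real DOM trees, whereas here a "document" is a flat list of HTML strings, and `toy_serialize` is a hypothetical stand-in serializer.

```python
import re
from difflib import SequenceMatcher


def toy_serialize(node):
    """Hypothetical serializer: strips tags, normalizes <h2> headings."""
    text = re.sub(r"<[^>]+>", "", node)
    return "== %s ==" % text if node.startswith("<h2>") else text


def selective_serialize(original_nodes, original_sources, edited_nodes,
                        serialize=toy_serialize):
    """Reuse original source for unchanged nodes; serialize only the rest."""
    matcher = SequenceMatcher(a=original_nodes, b=edited_nodes, autojunk=False)
    output = []
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: copy the author's original text verbatim.
            output.extend(original_sources[a1:a2])
        else:
            # Inserted or replaced nodes are serialized afresh; for a pure
            # deletion the edited slice is empty, so nothing is emitted.
            output.extend(serialize(node) for node in edited_nodes[b1:b2])
    return output


if __name__ == "__main__":
    nodes = ["<h2>Intro</h2>", "<p>First</p>", "<p>Second</p>"]
    sources = ["==Intro==", "First", "Second"]
    edited = ["<h2>Intro</h2>", "<p>Second, revised</p>"]
    print(selective_serialize(nodes, sources, edited))
```

The point of the design is that the untouched heading keeps its original spelling (`==Intro==`) instead of being rewritten in the serializer's normalized form, so an edit to one paragraph does not produce unrelated changes elsewhere in the page.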
The team also discussed and documented the longer-term Parsoid / MediaWiki strategy in the Parsoid roadmap. The performance-oriented C++ port was deprioritized in favor of DOM-based performance improvements and HTML storage. The basic idea behind storing (close to) fully processed HTML is to speed things up by doing no significant parsing on page view at all. In the longer term, VisualEditor-only wikis can avoid a dependency on Parsoid by switching to HTML storage exclusively. Overall, the plan is to leverage the Parsoid-generated HTML/RDFa DOM format inside MediaWiki core to enable better performance and editing capabilities in the future.
Editor engagement features
Echo (Notifications)
This month, we stepped up development on the Notifications project Echo and updated our first experimental release on mediawiki.org. Ryan Kaldari and Benny Situ improved the user experience for core features such as the badge, fly-out, all-notifications page and email notifications, and started developing new features such as bundling, dismiss and web preferences. Luke Welling completed work on HTML email and started development of a more robust job queue. Fabrice Florin led discussions about the Echo product plan, and new features and notifications under consideration, while Vibha Bamba designed new components of the user experience. We plan to develop some of these features and notifications in coming weeks, and are aiming for a first release on the English Wikipedia by the end of March; in the meantime, you can try the current version on mediawiki.org. We are also recruiting for a software engineer to join our team and work with us on this and other editor engagement projects.
Flow Portal/Project information
Flow entered the product design phase in early January. OPW intern Kim Schoonover began user research regarding how user-to-user talk pages are handled, and collected data about the difficulties that new (and existing) users have when using them. Engineering discussions started about potential back-end and scaling difficulties, the possible use of Wikidata's ContentHandler, and the evaluation of Wikia's MessageWall. A plan for community engagement was proposed and accepted; a consultation with experienced and newer users alike about the problems they face is planned for early February.
Article feedback
In January, our team updated Article Feedback v5 and discussed its release with communities in the English, French and German Wikipedias. Developer Matthias Mullie completed a major code refactoring, which is now being reviewed. He also developed a final set of new features, such as simpler moderation tools and better filters, to be tested next month. Dario Taraborelli and Aaron Halfaker posted a feedback evaluation report, which suggests that about 39% of the feedback collected in their study can be used to improve articles (see also their other study results). Oliver Keyes responded to community questions in a request for comments about future deployments on the English Wikipedia, with a final decision expected next month. Fabrice Florin led product planning and discussed a possible deployment with the French Wikipedia and with the German Wikipedia, which is currently evaluating the tool in an ongoing pilot, with a vote expected in May. Once our development is complete and communities reach their decisions for each project, we expect to release Article Feedback v5 on a range of Wikimedia sites in coming months.
Editor engagement experiments
In January, the Editor Engagement Experiments team ("E3") planned its goals for the quarter, which ends in March. We also made progress on the following projects, which are included in that plan.
First up, we launched guided tours on the English Wikipedia, including a test tour to demonstrate the capabilities of the extension, and a tour associated with the "onboarding new Wikipedians" (aka GettingStarted) project. In addition to tours created by the team, the extension supports community-created tours. Note that unlike many other projects by the E3 team, guided tours are planned as a permanent addition to Wikipedia, with each tour implementation considered to be experimental. (For example: the "getting started" tour will be delivered via a split A/B test.)
While building guided tours, the team also A/B tested the Getting Started landing page and task list, measuring the effect it had on driving new contributions. Several rounds of analysis were completed and published on Meta (round 1, round 2), with the conclusion that the onboarding experience is leading to small but statistically significant increases in new English Wikipedians attempting to edit, as well as saving their first edit. In addition to measuring the effects of the guided tour associated with this project, immediate plans are to redesign the landing page and add additional task types, to entice more new contributors.
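For readers unfamiliar with how an A/B test result is judged "statistically significant", the following sketch shows one common approach, a two-proportion z-test. The report does not say which test the team used, so this is an assumption, and the counts in the usage example are invented purely for illustration; they are not data from the experiment.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference of two proportions."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled proportion under the null hypothesis that both groups
    # share the same underlying conversion rate.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Invented numbers: 10,000 users per group, control vs. test condition.
    z, p = two_proportion_z_test(1000, 10000, 1100, 10000)
    print("z = %.2f, p = %.4f" % (z, p))
```

With large groups, even a one-percentage-point difference can clear the conventional 0.05 threshold, which is how an effect can be both small and statistically significant.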
Work also continued on refining the reliability and precision of the data collected from EventLogging. In particular, we migrated EventLogging to a dedicated database, and began collecting server-side events in addition to client-side, to support work such as measuring account creations on desktop and mobile. January also saw the heavy use of the new User Metrics API, in order to complete cohort analysis of onboarding users and for metrics reported at the Board presentation on the Foundation's year-to-date progress. Development of the API continues, and a public announcement is expected for early March. Last but not least, a call was put out for a part-time Technical Writer to work on documenting both of these pieces of infrastructure.
2012 Wikimedia fundraiser
January marks the official end of the 2012 fundraiser. The team spent the entirety of the month cleaning up and recovering from the very successful months of November and December, auditing the donations, and writing tools that will help the team run continuous auditing in the future.
Language tools
Development of the new user interface for Translate, as well as the translation editor functionality, continued throughout the month of January. Focus was on back-end work and extending the WebAPI to support the remaining features which are needed to reach feature parity with the current editor. The MediaWiki Language Extension Bundle 2013.01 was released. Universal Language Selector was deployed with limited features to a selection of Wikimedia projects using the Translate extension. Collaboration projects also continue with Red Hat's language technologies teams, with an upcoming work sprint to complete several projects extending internationalization support for Indic languages. Runa Bhattacharjee kicked off the Language coverage matrix, an attempt to compile a snapshot of our internationalization tools coverage per language for 300 languages.
Milkshake
More input methods were added to jQuery.IME, and bugs were fixed in jQuery.ULS.
GeoData Storage & API
After its soft launch in December, GeoData was officially announced this month. Work on improvements and bug fixing continues. The Special:Nearby page, which has been deployed to an experimental version of the site, represents the first major use of this feature on mobile projects. We hope to use it to help contributors identify articles in need of photos.
Mobile QA
The push to get MobileFrontend up and running on Beta Labs is well underway. We've also added test cases for Wikipedia Zero and we are planning a community test event for Mobile Upload and Commons in February.
Mobile design/Uploads
This month, the mobile web team finished up work on the watchlist feature and kicked off a 3-month sprint on photo uploads. The focus in January was on developing basic uploading infrastructure: uploading images to Commons under a single Creative Commons license. We also built out the UX/UI design for a call to action on articles lacking images in the lead section. Through this workflow, users can upload an image to Commons and add a thumbnail of the image to the appropriate article on their local Wikipedia or sister project, in one simple step. We also developed a mobile uploads page where contributors can see their recent uploads and potentially donate more images from their mobile device to Commons. These features are currently live on the Beta mobile site and are set to be released to the full mobile site in February.
Apps/Commons
January marked the first month of the Apps team's existence. Yuvaraj Pandian has started work with Brion Vibber on iOS and Android-based apps to upload photos to Commons. Both platforms are being developed concurrently and will have feature parity. Shankar Narayan joined us and will be supporting the team for all design needs. While the first iteration of the Commons App isn't scheduled to finish until February 8th, the team has already created two skeleton apps that can upload, share and show the user's contributions. The team will be spending their next iteration tweaking workflows and styling the app. We also released new versions of the Wikipedia app on iOS and Android in order to bring it into compliance with legal privacy/disclaimer requirements.
Wikipedia Zero
During January, Wikimedia was awarded a grant in the Knight News Challenge for our work in expanding Wikimedia mobile projects. Part of this grant will be used for Wikipedia Zero and the SMS/USSD projects to improve access to knowledge in the developing world. In addition, we've partnered with VimpelCom to provide Wikipedia Zero to at least 100 million additional customers this year.
MobileFrontend/J2ME app
During January, we've begun to explore ways to reduce the memory and processor requirements of our J2ME app, to increase the number of phones that can use this application.
Wikipedia over SMS & USSD
We are finishing work on capturing the metrics from the SMS server to learn usage numbers and determine how many sessions are completed.
MediaWiki 1.21/Roadmap
MediaWiki 1.21wmf7 and 1.21wmf8 were deployed in January on a modified schedule, due to holidays and because of the data center migration. Deployments have returned to their usual fortnightly schedule.
Git/Conversion
The ExtensionDistributor was rewritten in early January. While this was primarily done to support the data center migration, this was the first time ExtensionDistributor had received any significant attention since the migration to Git. The new version now uses the GitHub API to generate extension snapshots. We hope that the new version will be more reliable for users. SVN-based extensions are no longer supported, but this is not expected to impact many users since these extensions are largely unmaintained (all popular and active extensions have long since moved to Gerrit). As always, these extensions will remain in SVN should anyone still want the code.
Wikidata deployment
Sam Reed helped with the Wikidata deployment, deploying the Wikibase Client extension to Wikipedia in Hungarian, Hebrew, and Italian. Chris Steipp reviewed the Wikidata team's work to extend AbuseFilter for use with structured data. Aaron Schulz worked with Daniel Kinzler on job queue improvements.
Wikivoyage migration
Wikivoyage officially launched on January 15. Most of the Wikimedia Foundation's involvement was completed in November, but some minor bugfixing was done in support of the official launch.
Multimedia
Jan Gerber continues bugfixing and refining TimedMediaHandler, mainly focusing on operational improvements to make more efficient use of our server infrastructure. NFS for uploads/thumbnails has been unmounted from all Apache servers and the NFS back-end configuration was removed from MediaWiki; all files now only use Swift. A workaround has been added for the Swift back-end class when used with Ceph, so that temporary URLs can be used (for making video thumbnails for example). A Python script to copy files into Ceph has been run and is being worked on. Various issues have been reported in Ceph's bug tracker and are being looked at by the developers.
Lua scripting
Lua development was put on hold through the Ashburn data center migration. We've now resumed work on Lua, with Brad Jorsch and Tim Starling exposing to Lua more of the functions that are already available as template parser functions.
Site performance and architecture
A patch to allow moving the DB job queue to another cluster is under review. An experimental Redis-based job queue patch also exists in Gerrit. Code was merged to support more complex data structures (lists, sets) in memcached (with atomic updates).
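One common way to get atomic updates of a list stored in a key-value cache is a check-and-set retry loop: read the value together with a version token, modify it, and write it back only if nobody else changed it in the meantime. The sketch below illustrates that pattern; it is not the code that was merged into MediaWiki. `FakeCache` is a hypothetical in-memory stand-in with `gets`/`cas`-style methods, and storing the list as JSON is an assumption.

```python
import json


class FakeCache:
    """Hypothetical in-memory stand-in for a cache with gets/cas semantics."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def gets(self, key):
        """Return (value, token); a missing key yields (None, 0)."""
        return self._data.get(key, (None, 0))

    def cas(self, key, value, token):
        """Store value only if the key's version still equals token."""
        _, version = self._data.get(key, (None, 0))
        if version != token:
            return False
        self._data[key] = (value, version + 1)
        return True


def append_to_list(cache, key, item, max_retries=10):
    """Append item to the JSON list stored at key, retrying on conflict."""
    for _ in range(max_retries):
        raw, token = cache.gets(key)
        items = json.loads(raw) if raw is not None else []
        items.append(item)
        if cache.cas(key, json.dumps(items), token):
            return items
        # Another writer won the race; re-read and try again.
    raise RuntimeError("too much contention on " + key)


if __name__ == "__main__":
    cache = FakeCache()
    append_to_list(cache, "jobs", "a")
    print(append_to_list(cache, "jobs", "b"))
```

The retry loop never overwrites a concurrent writer's change: a write with a stale token is rejected, and the caller simply starts over from the fresh value.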
Admin tools development
The team mainly focused this month on improving the AbuseFilter extension, which is now working on the Wikidata site after support was added for other content types (as defined using ContentHandler). There was some significant work done on blocking abusive proxies and abuse limits, and some additional progress made on global AbuseFilters, user renaming and the interface for Stewards to mass-lock user accounts.
Security auditing and response
QA
Beta cluster
The main use for the Beta Cluster in January was to test git-deploy. Zeljko Filipin continues to run regular tests there. Antoine Musso, Max Semenik, and Andrew Bogott are setting up MobileFrontend to run on Beta for testing purposes.
Continuous integration
Antoine Musso worked with several MediaWiki extension authors to ensure that the unit tests for those extensions are run by Jenkins and that they work. He hopes to have all extensions that run on the Wikimedia production cluster fully operational by the end of February. Antoine also integrated PHP CodeSniffer into our automated test runs.
QA/Browser testing
Analytics/Kraken
Analytics/Limn
Bug management
This month, a first bugday was held, targeting bug reports which had not seen any changes for more than one year, resulting in about 30 tickets being updated. In addition, some cleanup work (decreasing the number of unprioritized bug reports and going through open reports in "ASSIGNED" status for more than a year) took place. Andre Klapper worked on small Bugzilla code changes and published initial information on Bugzilla usage per development team. Community members were invited to join the MediaWiki Group Bug Squad. Furthermore, some problems due to the data center migration were investigated, and it was discussed how to improve interaction on Bugzilla tickets that need handling by the Operations team (which mostly prefers to use the RT bugtracker instead).
Mentorship programs
Six Outreach Program for Women interns started on January 3rd and will work full time until April.
- Mariya is working on a discussion among third-party MediaWiki users.
- Valerie has completed the Bug Squad group proposal and a first Bug Day.
- Priyanka created a script and plans to move to Git.
- Sucheta is on schedule following her project plan.
- Kim is learning about Flow and the basics of interactive design as indicated by her mentor.
- Teresa has completed a solid base for her extension and is working on the main functionality. She hit a snag with her work environment this week, but is still on track with her proposed timeline.
The Google Summer of Code 2013 page was created, a pre-planning discussion started on wikitech-l, and LevelUp matchmaking for the first quarter of 2013 is nearly done.
Technical communications
Guillaume Paumier provided communications support to the engineering team, notably around the data center migration and associated banners, notices & translations. He started to organize and clean up the MediaWiki version pages (like MediaWiki 1.21/wmf7) to make them more useful for tech ambassadors, by highlighting the most important changes, improving translatability and adding navigation. He also prepared and organized translations for the How to report a bug and How to contribute pages, to facilitate the involvement of volunteers who don't necessarily communicate in English. Last, he created a Project:Calendar to consolidate and centralize announcements for all events, to make opportunities for participation more visible. Events around a particular topic (like QA, testing and bugs) can still be selectively transcluded, using Labeled Section Transclusion.
Volunteer coordination and outreach
The MediaWiki groups for Promotion and San Francisco were officially approved by the Wikimedia Affiliations Committee, and are the first Wikimedia User Groups created. We helped the Editor Engagement team organize a sprint to test Echo, but our plans to collaborate further with the Editor Engagement and Mobile teams were delayed; Quim Gil proposed a different approach combining regular, time-based QA and bug management activities, in the form of QA weekly goals. Two such events (non-Latin character testing in VisualEditor and a review of old bugs) happened in January, and more are scheduled. Heavy work was done with Chris McMahon to improve the top QA pages, although some problems remain. Template:MediaWiki News is now manually synced with social media, bringing fresh updates to the mediawiki.org homepage and News page. Quim also took the lead on organizing the Wikipedia Engineering Meetup on January 17th. He prepared an intro to MediaWiki & Wikimedia tech contributions, which he tested at FOSDEM, designed to be reused by other presenters. Last, we confirmed that technical projects are eligible for Individual Engagement Grants.
The Kiwix project is funded and executed by Wikimedia CH.
- We have adapted the kiwix-plug script to Tonidoplug2, a device cheaper than the Dreamplug. Kiwix was elected by SourceForge users as February's Project of the Month and an interview with Emmanuel Engelhart was published. For the first time, Kiwix reached 100,000 downloads a month in January.
- Besides Kiwix, the openZIM website was revamped and simplified for better readability. The openZIM bug tracker and source code management were migrated to the Wikimedia infrastructure (Bugzilla and Git).
The Wikidata project is funded and executed by Wikimedia Deutschland.
- January has been an exciting month for Wikidata. The deployment on the first Wikipedia sites (Hungarian, Hebrew and Italian) was completed. At the same time, work has continued on the user interface and back-end for statements, the core part of Wikidata's second phase. This will enable users to enter information like the children of a given person or a link to their portrait on Wikimedia Commons. These features can already be tested on the demo system. We've also worked on making AbuseFilter work with Wikidata, and wrote a new mechanism to distribute changes to the clients (Wikipedia) so they can show Wikidata changes in their RecentChanges. We made progress on using Solr for search and rewrote the draft for the inclusion syntax to be much simpler. This is the syntax that editors will use to include data from Wikidata in Wikipedia. A manual for using Pywikipedia on Wikidata was written as well.
- If you want to code on Wikibase, the software powering Wikidata, have a look at the outstanding bugs and tasks.
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.