User:Memeht/Improving the Wikimedia Performance Portal/Progress Reports

This page will house all reports, links to blog posts, and code samples created during the FOSS-OPW internship.

I also maintain a blog (about my FOSS-OPW Project amongst other things) here

Community Bonding Period

Community Bonding Report

1. How was your landing and your first meeting(s) with your mentors?

Unfortunately, since being selected as a FOSS-OPW intern, I have not been in contact with my mentors, as they have been in the middle of deploying high-impact Wikimedia features.

Due to the relatively short timeframe of the Internship, I focused on on-boarding from my end. This included conducting further research on Wikimedia Operating and Performance Goals, Wikimedia network architecture, Best Practices for Performance Management metrics (from high volume organizations like Google and New Relic), and Dashboard Design Fundamentals.

I created a Functional Specification draft, identified improvements to the dashboards displayed on GDash, and completed an introductory tutorial to Grafana.

2. What is the way of working that you have agreed? (tools in use, communication channels, meetings…)

As previously mentioned, I have not been in contact with my mentors, so this process has yet to be finalized.

In my proposal, I noted my work/learning style, and I am sure that after meeting with my mentors, we will be able to develop a working process.

3. Lessons learned since you applied for this OPW round and since you were accepted.
  • Fundamental use-case of Dashboards: As a tool to communicate insights, not necessarily for in-depth, on-the-spot analysis.
  • Different logging mechanisms used in Wikimedia's Platform.
  • Gained better context for the metrics being displayed on GDash.
  • Need to document data flows in order to provide context for data.
  • Understood how Phabricator works
See Project Phabricator Page

Week 2 (Dec 16 - Dec 22)

This week has been as research-heavy as the previous one. I have worked on retooling old models of MediaWiki's performance data and logging processes to help me better understand Visual Editor, and further explored <> for Analytics related to Visual Editor.

Since performance data is in a time-series format, I have also been taking a look at some basic statistical techniques for normalizing such data and identifying/accounting for seasonality within data spread. Interesting stuff!
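
As a rough illustration of the kind of basic technique meant here, the sketch below z-score normalizes a series and removes a fixed-period seasonal pattern by subtracting the mean observed at each position within the period. The sample latencies, the period of 4, and the function names are all hypothetical and are not drawn from Wikimedia data.

```python
from statistics import mean, pstdev


def zscore(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series has no spread; return zeros instead of dividing by 0.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def deseasonalize(series, period):
    """Subtract the average value seen at each position within the period."""
    seasonal = [mean(series[i::period]) for i in range(period)]
    return [x - seasonal[i % period] for i, x in enumerate(series)]


# Hypothetical latencies (ms) with a repeating pattern of length 4.
latencies = [100, 140, 180, 120, 110, 150, 190, 130]
residual = deseasonalize(latencies, period=4)
normalized = zscore(latencies)
```

With the seasonal pattern removed, what remains in `residual` is the underlying shift between the first and second period, which is easier to read on a dashboard than the raw values.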

I have also been in touch with my new interim mentor, Quim, and we decided on a weekly meeting schedule that fits both our schedules, which should help in the coming weeks when I get a chance to play with data.

I also worked on setting up Limn, but it's been rough going.


Project Pivot

The project pivoted to adding performance instrumentation to Parsoid.
List of Phabricator Tasks

Week 3 (Dec 22 - Dec 29)

  • Started work on interim MediaWiki project
    • Obtained an updated (Gerrit pull) copy of the Parsoid code base
    • Researched the HTML2wt & wt2HTML pipeline process
    • Reviewed current Visual Editor and Parsoid performance instrumentation (X-Parsoid Headers, Event Logging etc)
    • Ran some wt2HTML & HTML2wt tests on the command line and via the web interface
    • Researched the Parsoid and Visual Editor pipeline
    • Researched MediaWiki performance instrumentation guidelines
    • Detailed the tasks for the Parsoid project and made adjustments based on the Parsoid mentor's feedback
    • .js review
    • Determined communication plan with mentor (combination of IRC, Google Hangouts, email as needed)
    • Spoke with Mentor and other Parsoid team members on IRC to gain a better understanding of the Parsoid pipeline and its integration with Visual Editor
    • Weekly meeting with mentor/coordinator
    • Asked the MediaWiki Analytics mailing list about the SQL queries used to aggregate the metrics displayed on the 'edit-reportcard-wmflabs' page
    • Spoke with Parsoid project Mentor about Project deliverables and set up some beginning Project Goals

Project Deliverables

  • With the Parsoid mentor (Subramanya Sastry, "Subbu"), came up with some goals for the project:
    • Add instrumentation to HTML2wt pipeline using EventEmitters and StatsD/Event Logging
    • Metrics will include 'Time to DOM Diff', amongst others
    • Metrics should be displayed on a TBD front-end visualization interface
    • Possible deployment on or after the second week of January 2015
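
The event-based timing idea in these goals can be sketched as follows. This is only an illustration under stated assumptions: Parsoid is a Node.js code base and its real instrumentation is not shown here, so the `Emitter` and `StageTimer` classes, the `.start`/`.end` event naming, and the `html2wt.domdiff` stage name are all hypothetical stand-ins written in Python.

```python
import time
from collections import defaultdict


class Emitter:
    """Minimal stand-in for an event emitter: register handlers, then emit."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        self._handlers[event].append(handler)

    def emit(self, event, *args):
        for handler in self._handlers[event]:
            handler(*args)


class StageTimer:
    """Records elapsed milliseconds between '<stage>.start' and '<stage>.end'."""

    def __init__(self, emitter, stage, clock=time.monotonic):
        self.stage = stage
        self.clock = clock  # injectable so the timer can be tested deterministically
        self.started = None
        self.timings_ms = []
        emitter.on(stage + ".start", self._start)
        emitter.on(stage + ".end", self._end)

    def _start(self):
        self.started = self.clock()

    def _end(self):
        if self.started is None:
            return  # ignore an 'end' event with no matching 'start'
        self.timings_ms.append((self.clock() - self.started) * 1000.0)
        self.started = None


# Hypothetical usage: the pipeline emits events, the timer listens.
emitter = Emitter()
domdiff_timer = StageTimer(emitter, "html2wt.domdiff")
emitter.emit("html2wt.domdiff.start")
# ... the DOM diff step would run here ...
emitter.emit("html2wt.domdiff.end")
```

Keeping the timer as a listener means the pipeline code only has to emit events; what gets measured and where it is reported can change without touching the pipeline itself.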

Week 4 (Dec 29 - Jan 5)

  • Began discussion with the MediaWiki Analytics community about a possible visualization front-end for Parsoid metrics
  • Continued reading the Parsoid codebase to determine where instrumentation should be added
  • Reviewed .js callbacks and asynchronous programming (Promises API)
  • Spoke with Parsoid Mentor to further understand codebase and steps in instrumentation process
  • Eliminated Event Logging as a metric aggregation candidate, leaving statsd as the default
  • Light research on statsd implementation, statsd clients and RestBase's implementation
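
For context on what a statsd client does, here is a minimal sketch of reporting one timer sample. StatsD accepts plain-text datagrams of the form `<metric>:<value>|ms` for timers, conventionally over UDP on port 8125. The helper names and the metric name `parsoid.html2wt.domdiff` are hypothetical, and this is not the client or wrapper used in the project.

```python
import socket


def timing_line(metric, millis):
    """Format a StatsD timer sample as '<metric>:<value>|ms'."""
    return "{}:{}|ms".format(metric, int(round(millis)))


def send_timing(metric, millis, host="127.0.0.1", port=8125):
    """Fire-and-forget a single timer sample to a StatsD daemon over UDP."""
    payload = timing_line(metric, millis).encode("ascii")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


# Hypothetical metric name; 12.6 ms is rounded to a whole millisecond.
line = timing_line("parsoid.html2wt.domdiff", 12.6)
```

Because UDP is connectionless, a missing or slow StatsD daemon does not block the code being measured, which is one reason this style of reporting suits performance instrumentation.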

Week 5 (Jan 5 - Jan 12)

Week 6 (Jan 12 - Jan 19)

Week 7 (Jan 19 - Jan 26)

Week 8 (Jan 26 - Feb 2)

  • Updated and re-submitted patch
  • Started working on adding server-side settings for Grafana, including reaching out to mailing lists etc
  • Meeting with Mentors

Week 9 (Feb 2 - Feb 9)

  • Continued working on patch
  • Research on submitting a proposal for Open Source Bridge

Week 10 (Feb 9 - Feb 16)

  • Resolved a race condition in the code
  • Updated txstatsd wrapper code with changes to rbUtil.js
  • Continued working on patch

Week 11 (Feb 16 - Feb 25)

Week 12 (Feb 25 - Mar 4)

  • Continued implementing the performance instrumentation code
  • Spoke to several members of WMF community about dashboard best practices
  • Began Graphite/Grafana documentation

Final Days (Mar 4 - Mar end)