Wikimedia Discovery/Meetings/Search retrospective 2017-03-01

The Retrospective Prime Directive: “Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand.” — Norm Kerth

Action items from previous retro

What has happened (since January 25)?

(This list was initially populated from Discovery/Status updates).

  • Features
    • Work continues to upgrade to Elasticsearch 5 (T155671, T151224)
    • A lot of work on TextCat (language identification) has been deployed
    • Added the document content model to the search index (T156371), along with the contentmodel: keyword.
    • Added more aliases for filetype: keyword (task T156413).
    • Inter-wiki search is here! An A/B test of a new search results page which includes inter-wiki search results was deployed to Persian, Italian, Catalan and Polish Wikipedias. Take a look and give us feedback! (T149806)
      • The inter-wiki search test has been modified this week to allow for more bucketing/sampling of users so that we can get more data to analyze. 
      • Test started on Feb 9, updated on Feb 14 and ran until Feb 21, 2017; analysis is in progress
    • TextCat (language ID) improvements for German, English, Spanish, French, Italian, Japanese, Portuguese, and Russian Wikipedias have been deployed. (T149324) In general, more languages are available to be detected, and detection accuracy is higher.
    • TextCat has also been enabled on the Dutch Wikipedia. (T142140)
    • Wrote code for Wikidata search phase 1 (prefix/completion search)
    • WDQS got new servers, data reloaded, new version deployed, POST enabled, timeout raised to 1 min, some bugs fixed.
  • Tech debt/bugs/minor improvements
    • Fixed logging of deprecated code usage, which had stopped being logged (check logstash for `channel:deprecated`)
    • Performed extensive refactoring of Special:Search code to make it significantly easier to understand and build upon. (T150217) (technical debt)
    • Fixed a timeout issue with advanced searches (T152895, T134157) (not yet deployed, will be deployed with Elasticsearch 5 upgrade)
    • Fixed issue with ICU folding that caused problems with the search index (T156234)
    • A warning is now displayed if a user runs an advanced search query which only returns partial results due to a query timeout. (T149142)
  • Documentation
    • There's now a page on Testing Search.
  • Analysis
    • Created a list of languages for which we want to investigate analysers (T155549)
    • After analysis, decided to use Stempel as our new Polish language analyser (T154516); analysis of Stempel is underway (T154517)
    • Our initial analysis of the Stempel Polish analyzer is done. Overall it works well, but has some rare but bizarre stemming bugs. There’s a live demo with the Polish Wikipedia index in labs. Feedback on the Phab ticket (T154517) or talk page of the write up is very welcome.
  • Ops
    • Our codfw Elasticsearch cluster has been upgraded to Debian Jessie, its partitioning has been standardised, and 12 new servers have been added (phab:T151326, phab:T151328, phab:T154251)
    • Wikidata Query Service has two more brand new servers (phab:T152643, phab:T152643)
    • Fixed an issue with the portal caching a bad 404 error page (T158782)

Format: "Tiny Retrospective"


  • 5 ***** Do people want more feedback on their work/way
    • yes!
  • People in the collab space for stand-ups should make sure they are on screen when speaking.
    • that kind of goes for everyone :)


  • do we need a fully documented checklist on how to update the portal?
  • 2 ** Automate our Maven deployment / release process (it seems that this is a pain point for Stas / David and it looks to me that we could easily improve it)
    • Not sure about easily (is anything with Maven? ;) but yes, worth looking into it. 
    • Effort should probably be timeboxed, either it is easy enough or it isn't worth spending time...
  • 3 *** Find a way to smoothly enable new search keywords that need a reindex or new data; it's causing some confusion (e.g. the new contentmodel keyword)


  • Running A/B tests is a very slow feedback loop (perhaps not that tiny...)
  • 1 * Getting feedback on things that are still internal, before releasing to the world, is working fairly well - could it be improved?
    • e.g. sister search; wiktionary gadget
    • How to get feedback from people who are aware of what we're doing, before going to e.g. wikitech
    • Nothing is internal. :)
  • 3 *** Weekly sprint planning is sometimes focused on how quarterly goals are proceeding, sometimes on whether everyone has enough work, sometimes on triaging the work board. We also sometimes chat about specific projects and tasks at length because everyone is conveniently present. All of those seem like good things to do. Should we plan to regularly do them all?
  • Does Cross-Wiki work still need its own column on the Search workboard?
  • 5 ***** We sometimes focus too much on having efficient meetings
    • Some of what came up in work-centric unmeeting probably should have come up in a work meeting context
    • Sometimes focusing on quick and efficient meetings might mean that we aren't having conversations we should be having
  • 3 *** Product (user?) Testing: I'd like to see if there are ways we could expand our testing "toolbelt". We rely heavily on A/B tests, but there are testing methodologies that other teams use (e.g. community workshops, beta features, etc.) that might be worth exploring.
  • Tiny retrospecting is hard. +1 +1 +1


  • 5 ***** Do we need to foster more knowledge sharing between areas of expertise?
    • There have been some concerns that individual contributors have too much unique expertise
    • Could do joint problem-solving, occasional pair programming, etc. 
  • 1 * How to bring in outside people who want to help: contractor for Elasticsearch; API Fortress, Citolytics
    • Converting interest into a contract or other formal agreement
    • Our relationship with Blazegraph might be a model we could use
  • How's that liaison guy doing? This quarter has been a little light on community discussion compared to past. Do folks feel supported?
    • Supported? Yes! Does that liaison guy need more to do? No.


  • 5 ***** Do people want more feedback on their work/way
    • Erika has been hearing that people want more feedback
    • Thinking about how we can create a feedback mechanism where we support each other to be more effective at work and professional growth
    • Not sure this is a tiny thing
    • Being remote makes it more difficult to know what I'm doing right, where I could improve
    • Considering something like a "kudos box": async comments to each other (p2p or anonymized)
    • If you have suggestions of ways (e.g. that have worked in the past), please get them to Erika
    • Is this general, or specific like code quality? (Could be all of that...full range. Workstyle, work quality, behaviors.)
    • People often don't feel safe addressing these issues unless there is a safe space. Compassionately help each other grow. 
    • International team, so not all have English as a native language. Concern that statements could be taken as offensive [makes it harder to give feedback]. 
    • [As a native English speaker] I cut non-native speakers a lot of slack
    • Some emails (from people I don't know) sound harsh due to wording; later I found out that wasn't the intent. Easier with people we know well. 
    • Interested to hear more about the kudos box. Sounds like a positive feedback generator; also want non-positive (balanced) feedback. 
    • Thinking about a modified kudos box; experimenting with mixing appreciation with constructive feedback (AGF/trust-building). Don't want to force saying something positive in order to be constructive
  • 5 ***** Do we need to foster more knowledge sharing between areas of expertise?
    • What if someone leaves for any reason (lottery number)? Experienced recently with Yuri leaving maps. 
    • Not about mastering complete expertise
    • Search team Wednesday technical meetings have been very effective at sharing understanding. Works within a team; maybe not across teams; maybe not for some other teams
    • Analysis team, for example, has an Employee Operations Manual that was put together during Oliver's departure and documents almost everything
    • Documentation is always a first step. 
    • Teams at the Foundation tend to be rigidly structured--someone can be on a team for years without working on other stuff
  • 5 ***** We sometimes focus too much on having efficient meetings
    • A simple action would be, when the weekly check-in ends early, to open it up for any topics people want to discuss
      • Anyone is encouraged to add agenda topics to the doc at any time

Big ideas parking lot (captured here for future consideration)

  • (none)

Feedback on Tiny Retrospective

  • We ended up talking about the bigger issues, not the small ones

Action items

  • Erika: Follow up on feedback mechanisms
  • Erika: Follow up on knowledge sharing
  • Kevin: Follow up on not making meetings too efficient
  • Kevin: Send the list of little things around to everyone by email