Discovery/Status updates/2016-09-05

This is the weekly update for the week starting 2016-09-05.


  • Trey completed the analysis for optimizing language identification for the Dutch Wikipedia (nlwiki). The results were good (F0.5 = 82.3%) but not great. The small proportions of queries in the Romance languages and in German led to many more false positives than true positives and so they had to be excluded. Future work on improving confidence may help.
    • We could use help translating (via translatewiki) the relevant "showing results from" messages into Dutch. We need Dutch versions of the messages for English, Chinese, Arabic, Korean, Greek, Hebrew, Japanese, and Russian.
  • The Analysis team discussed better wording for phrases like "users were 1.07 times more likely to do X" and decided to use phrases similar to "we can expect 2-9 more sessions to click on a search result when they have the new feature".
  • The Search team wrapped up its investigation of the Elasticsearch instabilities that occurred on the eqiad search cluster on Aug 6, 2016; nothing conclusive was found.
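
For readers unfamiliar with the F0.5 figure quoted for nlwiki above: it is the F-beta score with beta = 0.5, which weights precision more heavily than recall. The sketch below shows the standard formula only; the function name `f_beta` and the sample precision and recall values are illustrative assumptions, not numbers from Trey's analysis.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall.

    beta < 1 favors precision, beta > 1 favors recall, beta = 1 is F1.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid dividing by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical values, chosen only to show the effect of beta.
example_precision = 0.8
example_recall = 0.5
print(f"F0.5 = {f_beta(example_precision, example_recall):.3f}")
print(f"F1   = {f_beta(example_precision, example_recall, beta=1.0):.3f}")
```

With these made-up inputs the F0.5 score (about 0.714) comes out higher than F1 (about 0.615), because precision is the stronger of the two values and F0.5 gives it more weight. The same weighting means false positives pull F0.5 down more sharply than missed identifications do.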

Events and News

Other Noteworthy Stuff

  • Our Elasticsearch clusters now have "row aware shard allocation". This means that we can theoretically lose one row of servers in our datacenter and still serve search traffic.
  • The Search team posted a request for comment to various Village Pumps and asked for it to be translated.
    • The request concerns the articles on MediaWiki about the new cross-wiki search results functionality and design.

Did you know?

  • A study came out yesterday showing that giraffes are actually four distinct species, rather than one. (Original article and BBC report.) Of course, the English and German Wikipedia pages on giraffes[1][2] have already been updated!