Discovery/Status updates/2018-11-26

This is the weekly update for the week starting 2018-11-26.



  • David worked on adding a config variable to control which (replica) clusters the sanitizer works on by default, since switching to a multi-instance setup will add temporary clusters that the sanitizer doesn't need to run on. [1] Note: this will go into production the week of December 11th, 2018
  • Erik repaired the transfer_to_es analytics job, which had stopped working twice in a row [2]
  • Erik and David worked on remedying intermittent JSON parse failures in the completion suggester [3] Note: this will go into production the week of December 11th, 2018
  • David fixed an issue where implemented LTR query features that rely on the feature vector were not compatible with the way Elastic implemented their profiling API [4]
  • Gehel, David, and Mathew worked on refactoring the current code base to support multiple elasticsearch instances/multiple elasticsearch clusters [5]
  • We needed to deploy extra-analysis-surrogates and the experimental highlighter to production before we could reindex the Chinese-language wikis [6] [7]
  • An issue was found where characters in CJK Extension C were being treated as U+FFFD when searching on zhWP; it is now fixed [8]
  • The team helped to set up two elasticsearch clusters on relforge to test multi-instance [9]
  • The team also helped with the Elastica dependencies that needed to be updated to v5.3.2 [10]
  • David worked on fixing the prefix search that had broken again when using multiple namespaces and namespaces with $wgCapitalLinks = false; [11]
  • David and Gehel completed work on preparing a Debian package with the experimental highlighter [12] and fixed an issue where the experimental highlighter was breaking Unicode surrogate pairs when cutting the snippets [13]
  • David fixed a precondition failure that occurred while trying to get the token image at offset -1 [14]
  • Mathew worked on issues where parts of the WDQS Puppet module are written in old Puppet and also lack type constraints [15]
  • David fixed an issue where elasticsearch_hot_threads on relforge had errors (ImportError: No module named 'yaml') [16]
  • Mathew and Gehel refactored wdqs::gui to separate cron tasks from the module [17]
  • Trey wrote a blog post about stemming, stop words, and thesauri. [18]


  • Jan reverted a change that caused the article count to show 0 for all wikis. [19]
  • Robin designed and Peter and Jan implemented banners for the big English fundraising campaign. [20]
  • Volker also fixed some bugs on the portal [21] [22]