Discovery/Status updates/2019-03-25
This is the weekly update for the week starting 2019-03-25
Discussions
Search
- Elasticsearch upgrade to v6:
- incident report
- Trey finished a deep dive [1] into the performance of language identification for cross-wiki searching (example [2]) and punctuation-related problems, and discovered things are working pretty well overall, but the Chinese language model is a bit off.
- Erik noticed that the inlabel / incaption keywords should highlight the label/caption but did not [3]
- David worked on fixing errors caused by Elasticsearch 6 deprecating nested_path and nested_filter [4] and _retry_on_conflict [5]
- We worked on migrating mjolnir to stdout/syslog/cee logging output [6]
- The team worked on the upgrade to elasticsearch 6.5.4 for cirrus, specifically for codfw [7] and for eqiad [8]
- Erik worked on the implementation and testing of glent m0 integration with wmf infrastructure [9]
- David did a lot of work to update the mw-config to use the psi and omega elastic clusters [10]
- David found that auto_generate_phrase_queries is deprecated and ineffective [11]
- The team fixed an old bug where we were getting fatal errors - "cannot perform this operation with arrays" from CirrusSearch/ElasticaWrite (using JobQueueDB) [12]
- Gehel worked to make spicerack more robust when unfreezing writes to elasticsearch / cirrus [13] as well as creating a cookbook to reset frozen write state on elasticsearch / cirrus [14]
- Stas moved WikibaseLexeme search code to WikibaseLexemeCirrusSearch extension [15]
- We noticed that Elasticsearch indices went read-only, causing a huge lag [16]
- We also saw that search exception handling was printing response information on the screen [17]
- The team fixed an issue where mwgrep was not working [18]
- We also silenced Elasticsearch 6 deprecation warnings to avoid logspam [19]
- We needed to create extra elasticsearch clusters in the beta cluster [20]
- We also needed some alerts so we know if mjolnir starts misbehaving [21]
- We also converted check_elasticsearch.py icinga plugin to py3 [22]
- We needed to start using a local nginx reverse proxy for connection reuse [23]
- The version of curator that we currently use (5.2.0) isn't compatible with elasticsearch 6, which causes issues in a few cron jobs on logstash servers. Version 5.6.0 supports both elasticsearch 5 and 6, so we updated to it [24]
- We also did some cleanup of the reprepro configuration for elasticsearch-curator [25]
- Getting a centralized way to inspect the content of the search profiles might be helpful when investigating search behaviors. In the same vein as other dump debug APIs (mapping/settings/cirrusdoc) David suggested that we should add a new simple API to dump the profiles (cirrus-profiles-dump) [26]
- David also found and fixed a "Call to a member function toArray() on a non-object (null)" error in vendor/ruflin/elastica/lib/Elastica/Client.php:736 [27]
--
- View all open tickets related to Discovery.
- Looking to get involved? See tasks marked as Easy or volunteer needed