Discovery/Status updates/2018-06-04
This is the weekly update for the week starting 2018-06-04.
Discussions
Search
- After lots of talk about stemmers getting committed and plugins getting deployed, the Slovak-language wikis have finally been *reindexed*, and stemming [1] is now happening on the Slovak wikis!
Search—Time Machine Edition
A few things from May that got missed:
- Trey wrote up some potential applications of natural language processing (NLP) to on-wiki search [2]. We're still going through them to pick out a couple that we'll turn into projects, probably next quarter. Right now, spelling correction and entity extraction are high on the list, but more questions, comments, and suggestions are welcome.
- Erik pulled 90 days' worth of regular expression (regex) searches across all wikis, and Trey did a quick survey of the most common patterns [3]. There are a lot more regex searches than we thought (5.6 million in 90 days!), and three apparently automated processes (bots, apps, or tools of some kind) are responsible for more than 90% of the regex searches.
--
- View all open tickets related to Discovery.
- Looking to get involved? See tasks marked as Easy or volunteer needed.