Topic on Help talk:CirrusSearch

Beland (talkcontribs)

Since this search engine replaced Lucene on the English Wikipedia, I've found it considerably harder to find what I'm looking for when I type in related terms, stabbing in the dark for a topic that I suspect is covered but whose exact title words I don't know. I now much more frequently have to resort to an external search engine like DuckDuckGo or Google to find what I am looking for. I work for a search engine myself, so I know relevance can be a fuzzy concept. My recommendation would be to assemble a reasonably sized test set of 100 or 1000 search queries and manually identify good matching articles for each. Then you have a concrete metric that lets you determine whether you are making things better or worse as you tweak the ranking algorithm. I fear many thousands of people are failing to find information of interest on Wikipedia because of this problem.
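To make the test-set idea concrete, here is a minimal sketch of how such a metric could be computed. It assumes mean reciprocal rank as the metric, which is only one reasonable choice; the function names, the toy judgments, and the two canned "search" functions are all hypothetical and do not reflect how CirrusSearch is actually queried or evaluated.

```python
from typing import Callable, Dict, List, Set


def reciprocal_rank(results: List[str], relevant: Set[str]) -> float:
    """Return 1/position of the first relevant title, or 0.0 if none appears."""
    for position, title in enumerate(results, start=1):
        if title in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    search: Callable[[str], List[str]],
    judgments: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Average the reciprocal rank over all judged queries, using the top k results."""
    if not judgments:
        return 0.0
    total = sum(
        reciprocal_rank(search(query)[:k], relevant)
        for query, relevant in judgments.items()
    )
    return total / len(judgments)


# Hypothetical hand-made judgments: query -> titles a human marked as good matches.
JUDGMENTS: Dict[str, Set[str]] = {
    "salar dy uyuni": {"Salar de Uyuni"},
    "largest salt flat": {"Salar de Uyuni"},
}


def baseline_search(query: str) -> List[str]:
    """Stand-in for the current ranker; the results are canned for illustration."""
    canned = {
        "salar dy uyuni": ["Uyuni", "Bolivia"],
        "largest salt flat": ["Salt pan (geology)", "Salar de Uyuni"],
    }
    return canned.get(query, [])


def tweaked_search(query: str) -> List[str]:
    """Stand-in for a ranker after a tweak; the results are canned for illustration."""
    canned = {
        "salar dy uyuni": ["Salar de Uyuni", "Uyuni"],
        "largest salt flat": ["Salar de Uyuni", "Salt pan (geology)"],
    }
    return canned.get(query, [])


if __name__ == "__main__":
    print("baseline MRR:", mean_reciprocal_rank(baseline_search, JUDGMENTS))
    print("tweaked MRR: ", mean_reciprocal_rank(tweaked_search, JUDGMENTS))
```

With a real test set, the two stand-in functions would be replaced by calls to the ranker before and after a change, and a higher score on the same judgments would suggest the change helped.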

Nemo bis (talkcontribs)

Indeed, it's been a while since the last such study we know of.

Beland (talkcontribs)

Here's an example:

I typed "Salar dy Uyuni" when I meant to type "Salar de Uyuni". CirrusSearch found two instances of "Salar de Uyuni" in article text, but it did not find the article "Salar de Uyuni" itself, which is what I was looking for. Even more bizarrely, if I search for "Salar de Uyuni", I get over 20 text matches, which seem just as legitimate as the two that show up with the misspelling.
