Svemir Brkic
Sphinx on ProyectoFedora.org
Svemir, I implemented Sphinx on ProyectoFedora.org/wiki, which is the Spanish Fedora Project wiki. I've been modifying the way it displays the search page and results, and wanted to share these changes back with you. You can look at them here: http://proyectofedora.org/wiki/Especial:SphinxSearch
Thanks for your work! --Nushio 03:58, 8 July 2009 (UTC)
- Cool, thanks! Feel free to add a link to your project on the extension pages, or I can do it later. Are there some changes or improvements you would like to contribute to the project? Svemir Brkic 12:50, 8 July 2009 (UTC)
- Like I said, I made some visual changes, mainly to SphinxSearch_body.php, to display the namespaces as in Wikipedia's search (based on user feedback). I'm also starting to tweak the results page to display more relevant information and to move things around, such as the formatting.
I'm still a n00b when it comes to hacking MediaWiki extensions (and PHP, for that matter), so most of the code is pretty "dirty", but functional. But of course I'd love to contribute back to the project! --Nushio 22:15, 8 July 2009 (UTC)
Reshuffle the SphinxSearch documentation
The talk page contains some useful information that is independent of any SphinxSearch release, so it might be a good idea to pull out some of those topics, group them, and create subpages. Topics that seem relevant:
- Extension_talk:SphinxSearch#More_Windows_Install_Issues --> Extension:SphinxSearch/Windows install
- Extension_talk:SphinxSearch#SQLite_Configuration --> Extension:SphinxSearch/SQLite configuration
- Extension_talk:SphinxSearch#Search_suggestions --> Extension:SphinxSearch/Search suggestions
- Extension:SphinxSearch/Search weight configuration
- maybe a page about categories and namespaces merged into a subpage Extension_talk:SphinxSearch#Category_filter, Extension_talk:SphinxSearch#Sorting_by_namespace
People always wonder about the differences between MW's search solutions, so a comparison matrix on the main page could describe their features side by side (syntax support, performance, single unique feature, etc.).
- Standard MW Search
- Extension:EzMwLucene
- Extension:Lucene-search
- Extension:Zend_Search_Lucene_for_MediaWiki
Priorities on the ToDo list would help people see what comes next. I mean, SPH_SORT_EXTENDED mode by @relevance and by number of times the page would be a killer feature and set SphinxSearch apart from the other search engines available. PDF indexing could be an item too, as proposed by [1].
Sphinx, text excerpts, xmlpipe2 and pdf/djvu indexing
We found a workaround and made a proof of concept that converts pdf/djvu files into text output, which can then be transformed into Sphinx xmlpipe2-style XML. Indexing the generated content through xmlpipe2 and merging it with the main index all go quite smoothly. The only issue is that when searching for a term, SphinxSearch points to the correct NS_IMAGE page ID in MediaWiki (and displays the correct file), but since MediaWiki stores no content information for this file, the search result can't display any text excerpts.
Our idea was to store the related text in another database table, independent from MW but with the necessary field identifier. The question now is where in SphinxMWSearch we could re-read the missing text excerpt, so that the result display behaves as if the text came directly from MediaWiki's text table.
We found a database select on line 344 (SphinxMWSearch.php, 0.8.5) which only fetches information from the page table, but we could not pinpoint when or where the text excerpts are actually fetched. Help would be much appreciated. --MWJames 09:15, 23 December 2011 (UTC)
- That happens in the core SearchResult class, method initText, in includes/search/SearchEngine.php. Svemir Brkic 21:50, 23 December 2011 (UTC)
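For readers following the xmlpipe2 approach discussed in this thread, here is a minimal sketch of how extracted pdf/djvu text might be wrapped into an xmlpipe2 document stream. It is not the proof of concept mentioned above. The function name `build_xmlpipe2` and the field names `page_title` and `old_text` are assumptions chosen for illustration; the fields must match whatever the main index in your own sphinx.conf uses so the two indexes can be merged.

```python
from xml.sax.saxutils import escape, quoteattr


def build_xmlpipe2(docs, fields=("page_title", "old_text")):
    """Return an xmlpipe2 stream for an iterable of (page_id, values) pairs.

    `values` maps a field name to its extracted text. The field names are
    hypothetical and should mirror the fields of the index being merged with.
    """
    out = ['<?xml version="1.0" encoding="utf-8"?>', "<sphinx:docset>"]

    # Declare the full-text fields once, before any document.
    out.append("<sphinx:schema>")
    for name in fields:
        out.append("<sphinx:field name=%s/>" % quoteattr(name))
    out.append("</sphinx:schema>")

    # One <sphinx:document> per file, keyed by the MediaWiki page ID so that
    # search hits point back at the NS_IMAGE page.
    for page_id, values in docs:
        out.append('<sphinx:document id="%d">' % int(page_id))
        for name in fields:
            # Escape &, < and > so raw extracted text cannot break the XML.
            out.append("<%s>%s</%s>" % (name, escape(values.get(name, "")), name))
        out.append("</sphinx:document>")

    out.append("</sphinx:docset>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = [(42, {"page_title": "Manual.pdf", "old_text": "R&D notes <draft>"})]
    print(build_xmlpipe2(sample))
```

A script like this would typically be named in the `xmlpipe_command` setting of a source with `type = xmlpipe2`, and its standard output is what the indexer reads. Text extracted from PDFs can also contain control characters that are not valid in XML, which this sketch does not filter.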
Category filter
Hello, I have installed the SphinxSearch extension. Can you tell me how to make a category filter like the one that is used in New World Encyclopedia? Thank you. --GnuDoyng 00:47, 27 December 2011 (UTC)
- That functionality has been deprecated. The current (SVN trunk) version of the extension supports "intitle:", "incategory:", "prefix:", and other advanced Wikipedia search techniques described here, and it also supports the extended Sphinx search syntax. --Svemir Brkic 15:09, 27 December 2011 (UTC)
A barnstar for you!
The Technical Barnstar
SphinxSearch rocks! SmartK (talk) 12:24, 16 August 2012 (UTC)
- Could you look into this? http://www.mediawiki.org/wiki/Extension_talk:SphinxSearch#Enable_search_box_suggestions-as-you-type_that_match_words_or_phrases_anywhere_in_the_page_title_16508
Enchant
How do you know enchant works? Maybe I'm drawing the wrong conclusions, but I thought that once you have a dictionary, there would be suggestions when you search for a misspelled word. For example, we have "dossier" about a hundred times in our wiki; when I search for "dosseir", I would expect a "did you mean" suggestion. Any help or guidance on what to research is greatly appreciated. Dries (talk)
- It may take me a few days to get back to you on this - I do not have it set up currently. Is the word in the dictionary created by the sphinx script? It is a plain-text file, so you should be able to see it. Svemir Brkic (talk) 17:45, 31 October 2012 (UTC)
- Hi, thanks already! Yes, the word "dossier" is clearly in the dictionary. Are there perhaps any additional settings you have to make (localsettings...)? Is it possible that other extensions 'steal' that functionality? Are there any hooks I have to look for? Dries (talk) 22:50, 31 October 2012 (UTC)
- I use soundex for now because it seems to work; not very well, but for now it does the job better. I would like to use enchant, though, because I guess the results will be better. Dries (talk) 12:20, 3 November 2012 (UTC)
- Hi, we found the problem. The necessary packages for enchant weren't installed, although they had been confirmed as installed earlier. Thanks for all your efforts for MediaWiki!
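As a side note on why the soundex fallback catches a typo like "dosseir", here is a minimal sketch of the classic American Soundex algorithm. It is an illustration only and not necessarily the exact variant used by the extension or by the database; the function name `soundex` is chosen for this example.

```python
def soundex(word):
    """Return the 4-character American Soundex code for a word."""
    # Consonant groups that share a digit; vowels, h, w and y get no digit.
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0].upper()
    prev = codes.get(letters[0], "")  # the first letter also suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same digit
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit  # a vowel resets prev, so a repeated digit counts again

    # Pad with zeros and truncate to one letter plus three digits.
    return (result + "000")[:4]


if __name__ == "__main__":
    for w in ("dossier", "dosseir"):
        print(w, soundex(w))
```

Because vowels carry no digit, swapping "ie" for "ei" does not change the code, so both spellings map to the same key. The same property explains why soundex matches can be loose: many unrelated words share a code.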
Just giving thanks
Hi, just stopping by to say thank you. I've got Sphinx running on Yellpedia.com. We're a massive new wiki that just went into public beta. Via a special script we wrote, we uploaded the entire United States Yellow Pages into a wiki, and thanks to your extension, searching all those pages (ten million plus) is a breeze.