Hi,
I am trying to run the 'build' step from the instructions on a full dump of the English Wikipedia site.
It runs at a reasonable rate until what appears to be a spell-correction step. That step starts at ~50,000 terms/second but keeps slowing down. I killed it after about a week, when it was down to ~600 terms/second and only about halfway through (at "mo...").
Are there configuration settings I should change to run the build step against a corpus this large?
Thanks, Barry