Vicaya Local Search HTTP Server

For the past few months I’ve been playing with indexing and search optimizations (and approximations) in order to search a local corpus quickly, without running out of memory, from a typical web browser. I’ve learned quite a lot about search algorithms, and the most significant lesson is this: they’re really complex.

I had come up with a number of solutions in JavaScript and had moved on to indexing. Instead of reinventing the wheel, I intended to use an existing indexer, such as Nutch, which is built on Lucene. I would then modify the index using Luke.

However, I found Nutch to be such a nice piece of software that I hoped to benefit directly from its innovations. I am already convinced that Lucene is the best open source search framework. Nutch, primarily written by the creator of Lucene, is a web crawler and search application that competes (at least theoretically) with the likes of Google, Yahoo, and Excite.

So I started rethinking this client-side search concept. I grabbed a copy of Jetty, read up on Nutch, dropped in a translation of the Dhammapada, and produced a double-clickable, searchable web server.

Y’all are welcome to download it and test it out. You may need to get the Java VM (perhaps you can test with 1.5). I have done nothing fancy, but I’d like to know how widely usable it is.

Just a word of warning: Running the web server may allow others to connect to your machine under certain conditions — as if clever hackers couldn’t anyway.
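
One common way to narrow that exposure is to have the server listen on the loopback address only, so that it accepts connections from the same machine and nothing else. The sketch below is not how this download is configured. It swaps in the JDK's built-in com.sun.net.httpserver.HttpServer in place of Jetty, it assumes Java 8 or later, and the class name LoopbackOnlyServer, the default port 8080, and the response text are all made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of the Vicaya download.
public class LoopbackOnlyServer {

    // Starts a tiny HTTP server that listens on the loopback interface only.
    // Passing port 0 lets the operating system pick a free port.
    static HttpServer start(int port) throws IOException {
        InetSocketAddress addr =
                new InetSocketAddress(InetAddress.getLoopbackAddress(), port);
        HttpServer server = HttpServer.create(addr, 0);

        // Placeholder handler; a real search page would go here.
        server.createContext("/", exchange -> {
            byte[] body = "local only\n".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });

        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default port; pass another one as the first argument.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8080;
        HttpServer server = start(port);
        System.out.println("Listening on " + server.getAddress());
        System.out.println("Press Enter to stop.");
        System.in.read();
        server.stop(0);
    }
}
```

With the server bound this way, a browser on the same machine can reach it through the loopback address, while requests arriving on the machine's other network interfaces are never accepted. It does nothing about other programs or users on the same machine.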