
Re: Desktop search tool using lucene



Mike MacCana wrote:

They (meaning engineers at Red Hat) are discussing this. The solution
won't use Lucene, as Lucene treats all file content as equal, i.e. it
doesn't know about headings being different from body text and so on.

Mike


Also, Lucene suffers from the Java UCS-2/UTF-16 scandal: Java chose a 16-bit character encoding which is compact for Japanese, but bulks up European languages by a factor of two compared with an 8-bit encoding, and a single 16-bit unit doesn't cover enough characters to do a good job with Chinese.
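
To put rough numbers on the encoding-size point, here is a minimal sketch in Python (not Java, and nothing to do with Lucene's actual internals) that just counts the bytes each encoding needs for a few sample strings. The function name encoded_sizes and the sample strings are made up for illustration.

```python
def encoded_sizes(text: str) -> dict:
    """Return the size of `text` in code points, UTF-8 bytes and UTF-16 bytes."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # "utf-16-be" is used so that no byte-order mark is counted.
        "utf16_bytes": len(text.encode("utf-16-be")),
    }


if __name__ == "__main__":
    samples = {
        "ASCII / European": "search",
        "Japanese": "\u65e5\u672c\u8a9e",
        # U+20000 lies outside the 16-bit range, so UTF-16 needs a surrogate pair.
        "CJK Extension B": "\U00020000",
    }
    for label, text in samples.items():
        print(label, encoded_sizes(text))
```

Plain ASCII text takes twice as many bytes in UTF-16 as in UTF-8, the Japanese sample is smaller in UTF-16, and a character beyond the 16-bit range needs two 16-bit units. How much of this shows up as index size or query speed in any particular search engine is a separate question that this sketch doesn't answer.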

Because of this, Lucene loses a factor of two in performance compared to C++ competitors such as Xapian, which is a minus for those who care about performance on computers that aren't monster servers with 8 gigs of RAM and Ultra320 disks. (Funnily enough, we're not all that happy with Lucene performance on such a machine either... But we've got a lot of text...)

