Desktop search tool using lucene
Paul A Houle
ph18 at cornell.edu
Tue Jun 28 14:04:53 UTC 2005
Mike MacCana wrote:
>>They (meaning engineers at Red Hat) are discussing this. The solution
>>won't use Lucene, as Lucene treats all file content as equal - i.e., it
>>doesn't know about headings being different from body text and so on.
>>
>>Mike
>>
>>
Also, Lucene suffers from the Java UCS-2 scandal: Java chose a 16-bit
character encoding which is good for Japanese, but bulks up European
languages by a factor of two and doesn't support enough characters to do
a good job with Chinese.
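
To make the size argument concrete, here is a rough sketch in Python
(not Lucene or Java code) that compares how many bytes the same text
takes in UTF-8 and in a 16-bit encoding. The helper name encoded_sizes
and the sample strings are made up for illustration; UTF-16 stands in
for Java's in-memory representation, which is an assumption about what
matters for the comparison.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, without a byte-order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Plain ASCII text, typical of English and much Western European content.
european = "Desktop search tool using Lucene"

# A CJK ideograph outside the 16-bit range (U+20000). A pure 16-bit
# encoding cannot represent it; UTF-16 needs two 16-bit units for it.
supplementary = "\U00020000"

print(encoded_sizes(european))       # 32 bytes in UTF-8, 64 in UTF-16
print(encoded_sizes(supplementary))  # 4 bytes in either encoding
```

The ASCII string doubles in size at 16 bits per character, which is the
"factor of two" above, while the ideograph beyond the 16-bit range only
fits as a pair of 16-bit units.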
Because of this, Lucene loses a factor of two in performance
compared to C++ competitors such as Xapian, which is a minus for those
who care about performance on computers that aren't monster servers with
8 gigs of RAM and Ultra320 disks. (Funnily enough, we're not all that
happy with Lucene performance on such a machine... But we've got a lot
of text...)