[katello-devel] search over rest api - interface design
Lukas Zapletal
lzap at redhat.com
Thu Jul 14 13:07:24 UTC 2011
On 07/14/2011 12:01 PM, Amos Benari wrote:
>
> I hope this is not too long mail, and waiting for comments on the interface suggestion.
Well, I am surprised that scoped_search is not a full-text engine. As a
long-time Apache Lucene user I have to recommend that project. It is the
de facto standard in full-text searching and it has been ported to many
languages (including Ruby, I guess).
Main advantages of a full-text engine:
- speed (typically much faster than an RDBMS for text queries)
- comes with a very rich query language
- what you describe here is already implemented in Lucene
- works like Google (TM)
- scoring
  - calculates relevancy
  - bonus scores
- fuzzy search (correcting typos...)
- it works with special structures like ids, dates, package names and
  versions - developers just need to implement "tokenizers"
- good scaling
- Spacewalk already uses Apache Lucene, so it works under our scenario :-)
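
To make the "tokenizers" and fuzzy search points a bit more concrete, here
is a minimal sketch in plain Python. It is not Lucene code: the function
names tokenize_package and fuzzy_lookup are made up for illustration, and
the standard library's difflib stands in for a real fuzzy matcher.

```python
import difflib


def tokenize_package(nvr):
    """Split a package string such as 'kernel-2.6.32-71.el6' into index tokens.

    Emits the full string, each dash-separated part, and each
    dot-separated piece of those parts, all lower-cased.
    (Hypothetical helper, not part of any real search library.)
    """
    tokens = [nvr.lower()]
    for part in nvr.lower().split("-"):
        tokens.append(part)
        pieces = part.split(".")
        if len(pieces) > 1:
            tokens.extend(pieces)
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(tokens))


def fuzzy_lookup(term, vocabulary, cutoff=0.75):
    """Return indexed terms that closely resemble a possibly misspelled term."""
    return difflib.get_close_matches(term.lower(), vocabulary, n=5, cutoff=cutoff)
```

With this, a search for "2.6.32" or "el6" can hit the kernel package above,
and a typo like "kernle" can still be mapped back to the indexed term
"kernel". A real engine does the same kind of thing with far better data
structures.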
Very rough overview of how I would implement it:
- collector component
  - periodically checks the database (or backend systems - Pulp etc.)
  - downloads new/changed/deleted data
  - updates the index database
- search component
  - provides searching capabilities
  - easy to implement
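
The two components above could be sketched roughly like this. Everything
here is an assumption for illustration: SearchIndex is a toy in-memory
inverted index standing in for the real index files, and fetch_changes is
a hypothetical callable that would query the database or Pulp.

```python
from collections import defaultdict


class SearchIndex:
    """Tiny in-memory inverted index: token -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = {}

    def upsert(self, doc_id, text):
        # Remove any previous version first so stale tokens do not linger.
        self.delete(doc_id)
        self.docs[doc_id] = text
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def delete(self, doc_id):
        old = self.docs.pop(doc_id, None)
        if old is None:
            return
        for token in old.lower().split():
            self.postings[token].discard(doc_id)

    def search(self, query):
        """Return ids of documents containing every query token."""
        tokens = query.lower().split()
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result


def collect(index, fetch_changes, since):
    """One collector pass: pull changes made after `since` and apply them.

    `fetch_changes` is a hypothetical callable returning tuples of
    (doc_id, text_or_None, timestamp); None text marks a deletion.
    Returns the newest timestamp seen, to be passed as `since` next time.
    """
    newest = since
    for doc_id, text, stamp in fetch_changes(since):
        if text is None:
            index.delete(doc_id)
        else:
            index.upsert(doc_id, text)
        newest = max(newest, stamp)
    return newest
```

The collector would be run from a periodic job, and the search component
is then little more than a call to index.search() behind the REST API.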
Maybe a full-text engine could be the answer in this case. The only
drawback is that it builds its own data files, but those are just some
files on disk. It also takes some time to index new data, but that does
not hurt much.
--
Later,
Lukas Zapletal | E32E400A
RHN Satellite Engineering
Red Hat Czech s.r.o. Brno