[katello-devel] search over rest api - interface design

Lukas Zapletal lzap at redhat.com
Thu Jul 14 13:07:24 UTC 2011


On 07/14/2011 12:01 PM, Amos Benari wrote:
>
> I hope this is not too long mail, and waiting for comments on the interface suggestion.

Well, I am surprised that scoped_search is not a fulltext engine. As a 
long-time Apache Lucene user I have to recommend that project. It is the 
de facto standard in fulltext searching and it has been ported to many 
languages (including Ruby, I guess).

Main advantages of fulltext:

- speed (usually much faster than an RDBMS for text queries)
- comes with very rich query language
  - what you describe here is already implemented in Lucene
- works like Google (TM)
  - scoring
  - calculate relevancy
  - bonus scores
  - fuzzy search (correcting typos...)
- it works with special structures like ids, dates, package names, 
versions - the developer just needs to implement "tokenizers"
- good scaling
- Spacewalk already uses Apache Lucene, so it works in our scenario :-)
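
To make the "tokenizers" point a bit more concrete, here is a minimal sketch of what one could do for package names and versions. It is plain Python and not the Lucene API; the function name tokenize_package and the choice of tokens are only my assumptions for illustration.

```python
def tokenize_package(name):
    """Split a package string such as "kernel-2.6.32" into search tokens.

    Hypothetical helper, not part of Lucene: it emits the whole string,
    each hyphen-separated part, and every dotted prefix of a part, so a
    query for "2.6" can match version "2.6.32".
    """
    text = name.lower()
    tokens = [text]
    for part in text.split("-"):
        tokens.append(part)
        pieces = part.split(".")
        # Dotted prefixes: "2.6.32" also yields "2" and "2.6".
        for i in range(1, len(pieces)):
            tokens.append(".".join(pieces[:i]))
    # dict.fromkeys drops duplicates and keeps first-seen order.
    return list(dict.fromkeys(tokens))


print(tokenize_package("kernel-2.6.32"))
```

A real analyzer would plug into the engine's indexing pipeline, but the idea is the same: decide once which pieces of a structured value should be searchable.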

A very rough overview of how I would implement it:

- collector component
  - periodically checks database (or backend systems - Pulp etc.)
  - downloads new/changed/deleted data
  - updates index database

- search component
  - provides searching capabilities
  - easy to implement
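
The two components above could be sketched roughly like this. It is a toy in-memory index in plain Python, not Lucene; SearchIndex, collect and fetch_changes are hypothetical names, and fetch_changes stands in for whatever query against the database or Pulp reports new, changed and deleted records.

```python
from collections import defaultdict


class SearchIndex:
    """Toy inverted index: token -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = {}

    def remove(self, doc_id):
        text = self.docs.pop(doc_id, None)
        if text is None:
            return
        for token in text.lower().split():
            self.postings[token].discard(doc_id)

    def add(self, doc_id, text):
        self.remove(doc_id)  # re-indexing replaces the old entry
        self.docs[doc_id] = text
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


def collect(index, fetch_changes, since):
    """One collector pass: apply everything changed after `since`.

    fetch_changes(since) is assumed to return (doc_id, text, version)
    tuples, where text is None for a deleted record. Returns the
    highest version seen, to be passed in on the next pass.
    """
    latest = since
    for doc_id, text, version in fetch_changes(since):
        if text is None:
            index.remove(doc_id)
        else:
            index.add(doc_id, text)
        latest = max(latest, version)
    return latest
```

The collector would be run periodically (cron, a background thread, whatever fits), always passing in the version returned by the previous pass, while the search component only ever reads the index.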

Maybe a fulltext engine could be the answer in this case. The only 
drawbacks are that it builds its own data files (but it's just some files 
on disk) and that it takes some time to index new data. It does not hurt much.

-- 
Later,

  Lukas Zapletal | E32E400A
  RHN Satellite Engineering
  Red Hat Czech s.r.o. Brno



