Re: Anyone know of a tasteful LGPL HTML parser in C?
- From: Daniel Veillard <veillard redhat com>
- To: Development discussions related to Fedora Core <fedora-devel-list redhat com>
- Cc: RPM internals development and distro coordination <rpm-devel lists dulug duke edu>
- Subject: Re: Anyone know of a tasteful LGPL HTML parser in C?
- Date: Wed, 24 Nov 2004 13:22:49 -0500
On Wed, Nov 24, 2004 at 12:33:58PM -0500, Jeff Johnson wrote:
> I'd like to attempt to support
> rpm -qp http://download.fedora.redhat.com/.../*.rpm
> within rpm by applying fnmatch(3) against parsed HTML hrefs.
>
> So I'm questing existing HTML parser implementations before hacking up
> something myself.
libxml2 HTML parser
> The constraints on my rpm problem/implementation space are:
> a) must be LGPL
MIT
> b) must be in C.
yes
> c) must be reasonably small and reliable.
if you link against the shared lib and use demand paging it's not too
big, otherwise it won't fit
> d) should work on a significant variety of HTML dialects without problem.
people have been using it to build commercial grade Web indexing software
Daniel
--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/