[publican-list] Possible alternative to wkhtmltopdf

Jeff Fearn jfearn at redhat.com
Wed Jan 18 02:41:47 UTC 2012


Hi Peter, sorry for the looooooong delay in responding ... people just 
keep demanding my attention! :)

Is there any reason you are not applying this effort to webkit? AIUI 
both Apple and Google have teams making webkit use CSS3 properly, so it 
would seem to be handy to take advantage of their efforts.

wkhtmltopdf seems to be a better comparison than FOP since we are 
actively moving to it. I have wkhtmltopdf built and working on all our 
arches, like PPC32 and S390, so I'm very keen on having that as a 
benchmark for consideration ... which of course excludes FOP! :)

I tried checking out the git repos but got errors:

fatal: The remote end hung up unexpectedly

Cheers, Jeff.

On 12/06/2011 11:42 PM, Peter Moulder wrote:
> I mentioned earlier that I was working on an HTML renderer to do
> pagination.  Let's call it Morp.  Although it isn't user-ready, the
> output is starting to look like a tempting alternative, at least for
> print usage.
>
>
> Headline features from a Publican point of view:
>
>    - HTML/CSS styling.
>
>    - Doesn't fall apart when encountering a keep-together block larger
>      than a page.
>
>    - Allows glyph fallback font substitution for mixed-script documents.
>
>    - Proper shaping for Indic scripts (using Pango).
>
>    - Decent page breaking: honours 'widows' & 'orphans' and so on, but
>      also tries to avoid breaks that are merely undesirable, such as
>      breaking a short list item, or even splitting a paragraph if this
>      can be easily avoided.  Conversely, it might allow a widow if the
>      alternatives seem worse.
>
>      (E.g. if I mark figures as page-break-before:avoid and
>      page-break-inside:avoid, then Morp chooses to give a widow on page 89 of
>      the below sample in preference to either breaking those constraints or
>      leaving the page only 60% full.)
>
>    - css3-page styling of page headings, page numbering (roman numerals
>      in preface), styling of the "blank" page before a chapter, different
>      margins between inside & outside edges, etc.
>
>    - Rounded borders for the <pre> things.  (This is the most obvious
>      visual difference from FOP page content, which I guess is due to
>      something missing from FOP.)
>
>    - Justified text good enough to actually use.
>
>      Web browsers and even word processors have taught people that
>      justified text can't be used satisfactorily, producing large gaps
>      and/or excessive hyphenation.
>
>      Morp may not apply every known technique, but already it's enough
>      that Publican-produced pages can look like a book rather than like
>      a web page or school project.
>
> (I have a feeling that FOP can do quite good justified text too, btw.)
>
>
> The most recent sample of wkhtmltopdf output that was posted to the list
> was the Red Hat Enterprise Linux 6 Installation Guide (in English):
>
>    http://fedorapeople.org/~jfearn/Red_Hat_Enterprise_Linux-6-Installation_Guide-en-US-TEST.pdf
>
> The corresponding document (though apparently a slightly different
> version) as rendered by FOP is
>
>    http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/pdf/Installation_Guide/Red_Hat_Enterprise_Linux-6-Installation_Guide-en-US.pdf
>
> while output from Morp is at
>
>    http://bowman.infotech.monash.edu.au/~pmoulder/Red_Hat_Enterprise_Linux-6-Installation_Guide-en-US-Morp.pdf
>
>
> I've tried to make the page styling match FOP output.  Given that
> Publican output has lots of screenshots (so it's hard to fill every page
> evenly no matter what you do), I've set the pagination not to try very
> hard to fill pages exactly, letting it break pages in more logical places
> (so usually breaking between paragraphs rather than within paragraphs).
>
> I've used SVG versions of the warning/note/important icons, whereas I
> replaced the list-item bitmap images with simple glyph markers (diamond
> and box).
>
> Some notable omissions are:
>
>    - No page references yet (e.g. in tables of contents).
>
>    - No clickable document outline or clickable links.  I'd do this if
>      Cairo made it convenient (someone was working on an interface for
>      that), but this isn't something my boss needs.  Otherwise, the
>      output could be labelled as "PDF for printing" or the like, with
>      people steered to EPUB or HTML for on-screen use.
>
> pjrm.
>
> _______________________________________________
> publican-list mailing list
> publican-list at redhat.com
> https://www.redhat.com/mailman/listinfo/publican-list
> Wiki: https://fedorahosted.org/publican
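
For reference, the break constraints and css3-page features described in
the quoted message might be written along these lines.  This is only a
sketch: the selectors (div.figure, div.preface, div.chapter), the margin
sizes and the widows/orphans counts are assumptions, not taken from
Publican's or Morp's actual stylesheets, and whether Morp accepts every
rule shown here is not confirmed.

```css
/* Sketch only: all selectors and values below are hypothetical. */

/* Limit stranded lines at the bottom (orphans) and top (widows) of a page. */
p {
    orphans: 2;
    widows: 2;
}

/* Keep a figure whole and attached to the content before it. */
div.figure {
    page-break-before: avoid;
    page-break-inside: avoid;
}

/* Wider inside (binding) margin than outside margin. */
@page :left  { margin-left: 20mm; margin-right: 30mm; }
@page :right { margin-left: 30mm; margin-right: 20mm; }

/* Roman page numbers in the preface, using a named page. */
div.preface { page: preface; }
@page preface {
    @bottom-center { content: counter(page, lower-roman); }
}

/* Start chapters on a right-hand page; style the blank page this forces. */
div.chapter { page-break-before: right; }
@page :blank {
    @top-center { content: none; }
}
```

The "avoid" values are preferences rather than hard rules, which matches
the behaviour described in the quoted message: the renderer can accept a
widow when the alternatives are worse.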


-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall