pdftohtml encoding question [SOLVED]

Tim ignored_mailbox at yahoo.com.au
Wed Mar 12 11:07:57 UTC 2008


On Wed, 2008-03-12 at 08:47 +0100, François Patte wrote:
> The problem was solved when I looked to the html file produced: this
> line was missing
> 
>     <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> 
> though the pdf file was produced from latex with utf8 encoding.
> 
> One mystery remains: why the default encoding for navigator (firefox),
> or openoffice, is latin1? 

According to the HTTP specifications, the default encoding for web
browsing and serving is iso-8859-1; anything different has to be stated
explicitly.  The meta statement is one way to do that, and about the
only choice you have if you open a file directly rather than having it
served by a web server.  If it is served, then the HTTP headers about
content type overrule anything typed into the file itself (the meta
statement is to be ignored).  If you set your browser's default to
something other than iso-8859-1, you'll have problems rendering pages
that are served correctly (i.e. pages written in iso-8859-1 without a
specific content type description saying so), and that's an awful lot
of web pages.
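
To make that order of precedence concrete, here is a rough Python
sketch.  It is only an illustration, not how Firefox or any real browser
does it: the function name effective_charset, the regular expressions
and the 1024-byte scan limit are all my own assumptions.  I also assume
that a Content-Type header carrying no charset parameter lets the meta
statement apply.

```python
import re

# Fallback when nothing states an encoding (the HTTP default discussed above).
DEFAULT_CHARSET = "iso-8859-1"

# Hypothetical helper patterns; real browsers use a far more careful parser.
_HEADER_RE = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)
_META_RE = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)',
                      re.IGNORECASE)


def effective_charset(html_bytes, http_content_type=None):
    """Pick the charset: HTTP header first, then meta, then the default."""
    # 1. A charset in the HTTP Content-Type header overrules the file itself.
    if http_content_type:
        match = _HEADER_RE.search(http_content_type)
        if match:
            return match.group(1).lower()
    # 2. Otherwise (e.g. a file opened directly) look for a meta statement
    #    near the top of the document.
    match = _META_RE.search(html_bytes[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. Nothing stated anywhere: fall back to the default.
    return DEFAULT_CHARSET


page = (b'<html><head><meta http-equiv="Content-Type" '
        b'content="text/html; charset=utf-8"></head></html>')

print(effective_charset(page))                                   # file opened directly
print(effective_charset(page, "text/html; charset=iso-8859-1"))  # header wins
print(effective_charset(b"<html></html>"))                       # nothing stated
```

The missing meta line matters because UTF-8 bytes read as iso-8859-1
turn each accented letter into two wrong characters, which is the usual
symptom of this kind of problem.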

-- 
(This computer runs FC7, my others run FC4, FC5 & FC6, in case that's
 important to the thread.)

Don't send private replies to my address, the mailbox is ignored.
I read messages from the public lists.



