
Re: Charset config of apache 2.x



On Wed, 2007-02-21 at 23:45 +0800, edwardspl ita org mo wrote:
> I just change the config of apache 2.x :
> # AddDefaultCharset UTF-8
> AddDefaultCharset Big5
> 
> But the result of display ( IE ) still utf-8...
> So, how to fix the problem ?

Just to be sure that isn't an MSIE stupidity, try the same thing with
lynx.  e.g. lynx --head http://www.example.com/your-test-page
(replacing that fake URI with one from your testing server).  You'll get
a page back with the headers that the server actually sent.

Here's one I just tried, and it didn't return any charset information
(as part of the Content-Type header).  Very naughty of it.  My web
browser will presume that it's probably iso-8859-1, but that's a
user-setting.

$ lynx --head http://www.google.com.au/

HTTP/1.0 200 OK
Cache-Control: private
Content-Type: text/html
Set-Cookie: PREF=ID=0464941a5b709653:TM=1172074296:LM=1172074296:S=TAa056r0fealvnL6; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com.au
Server: GWS/2.1
Content-Length: 0
Date: Wed, 21 Feb 2007 16:11:36 GMT
Connection: Keep-Alive


And here's another example; this time it told me to expect iso-8859-1:

$ lynx --head http://www.optus.com.au/

HTTP/1.1 302 Found
Set-Cookie: LBPRDPROXYEXT=346628f8346628f4baeebad6; path=/
Date: Wed, 21 Feb 2007 16:13:31 GMT
Server: Apache/2.0.52 (Win32) mod_ssl/2.0.52 OpenSSL/0.9.7e
Location: http://www.optus.com.au/portal/site/oca
Connection: close
Content-Type: text/html; charset=iso-8859-1
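
If you'd rather check that header in a script than by eye, here's a
minimal sketch in Python.  It only pulls the charset parameter out of a
Content-Type value like the two shown above; the function name
charset_from_content_type is just something I made up for illustration,
and it assumes you've already fetched the header value by some other
means.

```python
from email.message import Message


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None.

    Hypothetical helper: it only parses the header string, it does not
    fetch anything from a server.
    """
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns the charset in lower case,
    # or None when the header carries no charset parameter.
    return msg.get_content_charset()


if __name__ == "__main__":
    # The two header values from the lynx output above.
    print(charset_from_content_type("text/html"))
    print(charset_from_content_type("text/html; charset=iso-8859-1"))
```

A result of None is the "very naughty" case: the server said nothing, so
the browser is left to its own devices.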



MSIE will sometimes be preset to presume something, rather than pay
attention to the server.  It's also known to make stupid guesses.

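The "default" charset is only used when nothing else has already set one
for that response.  Also bear in mind that AddDefaultCharset only changes
the label in the Content-Type header (for text/html and text/plain); it
doesn't convert anything, so the files really do have to be saved as
Big5.  That said, I'd recommend sticking with UTF-8: it was created to
replace the numerous other encoding schemes with a single one.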

It's also possible that individual HTML files declare themselves to be
in a certain encoding scheme, by a meta statement in the head of the
HTML.  That's only supposed to be paid attention to if the HTTP server
didn't already specify the encoding in the HTTP headers.

Hierarchy:
     1. Pay attention to the HTTP headers, no matter what.
     2. If there's no charset HTTP header information, look at a meta
        statement.
     3. If there's none of the above, it's up to the user to work out
        what to do (they could preselect a default, configure their
        browser to assess the page and make a guess, or the browser
        might just presume iso-8859-1).
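
That hierarchy can be sketched in a few lines of Python.  Treat it as an
illustration of the order of precedence, not as what any real browser
does: pick_charset, META_RE and the user_default parameter are names
I've invented, the meta check is a crude regular expression over the
first kilobyte, and the iso-8859-1 fallback is only an assumed user
setting.

```python
import re
from email.message import Message

# Crude, illustrative pattern for a charset declared in a <meta> element.
META_RE = re.compile(rb"""<meta[^>]+charset\s*=\s*["']?([\w.:-]+)""", re.IGNORECASE)


def pick_charset(content_type, body, user_default="iso-8859-1"):
    """Return (charset, source) following the three-step hierarchy.

    content_type: the Content-Type header value as a string
    body:         the raw page bytes
    user_default: assumed fallback standing in for the user's setting
    """
    # 1. The HTTP header wins whenever it names a charset.
    msg = Message()
    msg["Content-Type"] = content_type
    header_charset = msg.get_content_charset()
    if header_charset:
        return header_charset, "http-header"

    # 2. Otherwise look for a meta statement near the top of the page.
    match = META_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower(), "meta"

    # 3. Otherwise it's up to the user (here, a preselected default).
    return user_default, "user-default"


if __name__ == "__main__":
    page = b'<meta http-equiv="Content-Type" content="text/html; charset=Big5">'
    print(pick_charset("text/html; charset=iso-8859-1", page))
    print(pick_charset("text/html", page))
    print(pick_charset("text/html", b"<p>no declaration at all</p>"))
```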

The user can also preselect an encoding type, to override information
provided by the server.  That allows you to read stuff that's been
incorrectly identified.
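
To see why that override is useful, here's a small Python sketch of a
page whose bytes are really Big5 but which gets read as iso-8859-1.  The
sample text and the decode_as helper are made up for the example; the
point is only that the same bytes come out as rubbish under the wrong
label and as the intended text once the right encoding is selected.

```python
def decode_as(raw, charset):
    """Decode raw page bytes using the given charset name (hypothetical helper)."""
    return raw.decode(charset)


# Two Chinese characters, written with escapes so this file stays ASCII.
text = "\u4e2d\u6587"
raw = text.encode("big5")  # what the server actually sends

mislabelled = decode_as(raw, "iso-8859-1")  # wrong label: every byte becomes one character
overridden = decode_as(raw, "big5")         # user overrides the label

if __name__ == "__main__":
    print(repr(mislabelled))
    print(overridden == text)
```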

-- 
(This PC runs FC4, my others FC5 & FC6, in case that's important
 to the thread)

Don't send private replies to my address, the mailbox is ignored.
I read messages from the public lists.

