
Re: utf-8 typing problem in X



Carlo Nyto wrote:
> 2008/11/7 Gordon Messmer <yinyang eburg com>:
>> I think you should be using LANG="ja_JP.UTF-8"
>
> You think my problem is because I have it set to Japanese, or that I
> should change it to Japanese to get proper English text?

I think you have *something* set to use Japanese locale. Your messages come through with these MIME headers:

Content-Type: text/plain; charset=ISO-2022-JP
Content-Transfer-Encoding: 7bit

Your messages aren't UTF-8 at all. They're sent in ISO-2022-JP, a completely different encoding built on Japanese character sets. That's probably a setting in your browser. If you're using Firefox, check View menu -> Character Encoding.
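
As a rough illustration of what those headers mean, here is a small Python sketch that parses them with the standard email module and reports the declared charset. Only the two header lines come from the actual message; the body text and the variable names are made up for the example.

```python
from email import message_from_string

# Only the two header lines are taken from the thread;
# the body is a made-up placeholder.
raw_message = (
    "Content-Type: text/plain; charset=ISO-2022-JP\n"
    "Content-Transfer-Encoding: 7bit\n"
    "\n"
    "Test\n"
)

msg = message_from_string(raw_message)

# get_content_charset() returns the charset parameter, lowercased.
declared_charset = msg.get_content_charset()
transfer_encoding = msg["Content-Transfer-Encoding"]
print(declared_charset, transfer_encoding)

# Plain ASCII text encodes to the same bytes in ISO-2022-JP and UTF-8,
# so the declared charset only matters once non-ASCII characters appear.
same_bytes = "Test".encode("iso2022_jp") == "Test".encode("utf-8")
print(same_bytes)
```

A mail client that declares ISO-2022-JP for plain English text is therefore a hint about its configuration rather than proof that the text itself is wrong.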

But if LANG isn't set to a Japanese locale, and none of the LC_* variables are set, I'm not sure where else to look. If you're using GNOME, you can try System -> Preferences -> Personal -> Input Method, and see if an input method is enabled. You could also check System -> Preferences -> Hardware -> Keyboard -> Layouts tab and see how your keyboard layout is set.
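
A quick way to double-check the environment side of this is sketched below in Python. The helper names locale_settings and japanese_settings are hypothetical, and the sketch only looks at environment variables; it cannot see GNOME input method or keyboard layout settings.

```python
import os

def locale_settings(env=None):
    """Return the locale-related variables (LANG, LANGUAGE, LC_*) found in env."""
    if env is None:
        env = os.environ
    return {
        name: value
        for name, value in env.items()
        if name in ("LANG", "LANGUAGE") or name.startswith("LC_")
    }

def japanese_settings(env=None):
    """Return only the locale variables whose value starts with 'ja'."""
    return {
        name: value
        for name, value in locale_settings(env).items()
        if value.lower().startswith("ja")
    }

# Print whatever locale variables the current session has set.
for name, value in sorted(locale_settings().items()):
    print(f"{name}={value}")
```

If japanese_settings() comes back empty in the session where the typing problem occurs, the environment variables are probably not the cause and the input method or keyboard layout is the next thing to check.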

>> I can tell you how to change the encoding of a file, but I'm not aware of
>> any program that can shift characters to different Unicode code points. The
>> problem isn't that the system is displaying your characters badly, it's that
>> the characters are being entered as fullwidth Latin characters rather than
>> regular ASCII.  They look similar when printed, but they're not the same
>> Unicode characters.
>
> I agree this conversion would be a very difficult - perhaps impossible
> - problem to solve. Do you mean "fullwidth" as in a multi-byte UTF-8
> character?

Not at all. I mean that there is a Unicode character called "Fullwidth Latin Capital Letter T" (code point U+FF34), which is an entirely different character from the standard Latin capital letter "T" (code point U+0054, ASCII hex value 54).
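
To make the difference concrete, here is a minimal Python sketch using the standard unicodedata module. It prints the code point and name of each character, and then shows one possible way to map fullwidth forms back to ASCII using NFKC normalization. That conversion is an approach I'm adding for illustration, not something proposed in this thread, and the helper name to_halfwidth is hypothetical.

```python
import unicodedata

fullwidth_t = "\uFF34"  # FULLWIDTH LATIN CAPITAL LETTER T
ascii_t = "T"           # LATIN CAPITAL LETTER T, U+0054

# The two characters look alike but have different code points and names.
for ch in (fullwidth_t, ascii_t):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

def to_halfwidth(text):
    """Fold compatibility characters, including fullwidth Latin forms,
    to their ordinary equivalents via NFKC normalization."""
    return unicodedata.normalize("NFKC", text)

# "Test" typed as fullwidth characters, written with escapes for clarity.
fullwidth_word = "\uFF34\uFF45\uFF53\uFF54"
print(to_halfwidth(fullwidth_word))
```

Note that NFKC also rewrites other compatibility characters (ligatures, superscripts, halfwidth katakana and so on), so it is only a reasonable cleanup step when that wider folding is acceptable. It repairs text after the fact and does nothing about the input setting that produces the fullwidth characters in the first place.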

> That is in fact my problem. The fact that they are rendered
> incorrectly

Technically, they're rendered correctly. The problem is not rendering, but input. The characters are being input in the wrong locale.

