UTF8 settings (was: Can scp be used to update a directory?)

Anne Wilson cannewilson at tiscali.co.uk
Sat Mar 25 09:48:26 UTC 2006


On Friday 24 March 2006 13:34, James Wilkinson wrote:
>
> A backup from an FC3 machine listed
> SUPPORTED="en_GB.UTF-8:en_GB:en:en_US.UTF-8:en_US:en"
> although I doubt both en references are strictly necessary.
>
OK, I've altered the file.

> > Here's a sample -
> >
> > ../Mp3/marisa_monte/rose_and_charcoal/06_dan�_da_solid�.mp3
> >
> > The title should read
> >
> > 06_dança_da_solidão.mp3
>
> That's actually a different symptom of the same problem. UTF8 takes two
> bytes to store most common non-ASCII characters, whereas the ISO-8859
> family always uses one byte.
>
> What you first described was seeing the two UTF8 bytes in an ISO-8859
> program, so each accented character shows as two ISO-8859 characters
> (some of which will probably be "illegal", so you'll see spaces or
> something similar there).
>
It's quite possible that the two different displays were because, when first 
attempting to troubleshoot this, I experimented by setting different 
character sets in kde.
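
The first symptom James describes (UTF-8 bytes displayed by an ISO-8859 program) can be reproduced in a few lines. This is only a sketch: the file name is a shortened, hypothetical version of the sample, and ISO-8859-1 is assumed as the member of the ISO-8859 family in use.

```python
# Sketch of the first symptom: UTF-8 bytes shown by an ISO-8859-1 program.
# "06_dança.mp3" is a shortened stand-in for the sample file name.
name = "06_dança.mp3"

# In UTF-8 the accented letter takes two bytes; in ISO-8859-1 it takes one.
utf8_len = len("ç".encode("utf-8"))
latin1_len = len("ç".encode("iso-8859-1"))

# An ISO-8859-1 program treats every byte as a separate character,
# so the two UTF-8 bytes of "ç" show up as two characters.
utf8_bytes = name.encode("utf-8")
seen_by_latin1_program = utf8_bytes.decode("iso-8859-1")

print(utf8_len, latin1_len)
print(seen_by_latin1_program)
```

The decoded string is two characters longer than it would be with one character per letter, because "ç" has turned into a pair of ISO-8859-1 characters.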

> What you've just illustrated is an ISO-8859 name viewed in a UTF-8
> environment, where two ISO-8859 characters are interpreted as one
> illegal UTF-8 character.
>
> My first reaction is to blame the generating program (what was it?)

Grip generated the mp3s.  I first saw the problem in k3b, but then in 
konqueror and kmail, all under FC4.
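
The second symptom, an ISO-8859 name viewed in a UTF-8 environment, is the reverse case. A minimal sketch, again assuming ISO-8859-1 and a shortened, hypothetical file name:

```python
# Sketch of the second symptom: ISO-8859-1 bytes read as if they were UTF-8.
name = "06_dança.mp3"
latin1_bytes = name.encode("iso-8859-1")  # "ç" is the single byte 0xE7

# 0xE7 announces a multi-byte UTF-8 sequence, but the byte after it is a
# plain ASCII letter, so a strict UTF-8 decoder rejects the name.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A display program cannot simply refuse, so it typically substitutes a
# replacement character (U+FFFD) for the part it could not decode.
shown = latin1_bytes.decode("utf-8", errors="replace")

print(strict_ok)
print(shown)
```

Exactly how much of the name a given program swallows along with the bad byte depends on its decoder, so the display can differ from one program to the next.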

> In my experience, many MP3 programs, following Winamp's example, have
> gone flat-out for skins and custom text-handling. Too many of them
> don't support UTF8 in $LANG properly.
>
> Alternatively, what did the server box use to run? How did you transfer
> the files? Red Hat went to UTF-8 early, and many other distros took a
> lot longer to upgrade. And transferring files might not get the
> conversion right.
>
It was running Mandriva 10.0.  In truth, though, I can't remember whether the 
box that generated the files was running Mdv 10.1 or 10.2.  I don't think 
10.0 had utf-8 (could be wrong) but it's very likely that I never elected to 
use utf-8 when it first became available.

> (You used to use Mandriva, didn't you? I'm not sure when they adopted
> UTF-8...)
>
> I wrote:
> > As for the single e-mail -- I'd blame the other end, personally.
>
> Anne said:
> > Maybe.  Maybe he has the same problem as I do.
>
> Um. Mail clients have no business not knowing which encoding they're
> using. And if they know that, they've no business not putting it into
> the headers of outgoing e-mail properly.
>
> We've proved that your e-mail client can receive UTF-8. I suppose
> there's still the chance that your correspondent used a weird encoding
> that your client didn't understand. But you're not going to get the
> "right" message anyway in those situations, except by blind luck.
>
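
As an illustration of what "putting it into the headers" means, here is a small sketch using Python's standard email package. It is not how any particular mail client works; the subject and body are made up for the example.

```python
from email.message import EmailMessage

# Build a message whose body contains non-ASCII text.
msg = EmailMessage()
msg["Subject"] = "UTF8 settings"
msg.set_content("06_dança.mp3", charset="utf-8")

# The sender's encoding is declared in the Content-Type header, so the
# receiving client does not have to guess.
print(msg["Content-Type"])

# A receiver that honours the declared charset recovers the original text.
declared = msg.get_content_charset()
body = msg.get_content().strip()
print(declared, body)
```

If the charset parameter were missing or wrong, the receiver would be left guessing, which is the "blind luck" case above.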
Well thanks for the insights I've got, anyway.  And finding convmv was another 
good thing to come out of it.  It all helps.
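
For readers without convmv to hand, the core idea of converting a file name from one encoding to another can be sketched in Python. This is not convmv's implementation; latin1_name_to_utf8 is a hypothetical helper, and it assumes the old names are ISO-8859-1.

```python
def latin1_name_to_utf8(raw_name: bytes) -> bytes:
    """Return raw_name re-encoded as UTF-8, assuming it is ISO-8859-1.

    Names that already decode as UTF-8 (including plain ASCII) are
    returned unchanged, so a second run does not double-encode them.
    """
    try:
        raw_name.decode("utf-8")
        return raw_name
    except UnicodeDecodeError:
        return raw_name.decode("iso-8859-1").encode("utf-8")


# Hypothetical example name, stored on disk in ISO-8859-1.
old = "06_dança.mp3".encode("iso-8859-1")
new = latin1_name_to_utf8(old)
print(old, "->", new)
```

On Linux, os.listdir() and os.rename() accept bytes paths, so the helper could be applied to real directory entries. Trying it on a copy of the files first would be prudent.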

Anne
>
> --
> E-mail address: james | In the Royal Air Force a landing's OK,
> @westexe.demon.co.uk  | If the pilot gets out and can still walk away.
>
>                       | But in the Fleet Air Arm the outlook is grim,
>                       | If your landings are duff and you've not learnt to
>                       | swim.