Character encoding

Adil Drissi adil.drissi at yahoo.com
Sun Sep 7 02:39:18 UTC 2008


Hi, 

Thank you for your answer. I want to use this on my personal computer.
Can you give me the name of the variable, please? Say I set that variable to UTF-8 in /etc/profile: do you think vim will then always save my files in UTF-8?

Another thing: many editors let you choose the text encoding, and that is what I want set to UTF-8 by default. I know that in my HTML code I have to set it manually.

Thank you

--- On Sat, 9/6/08, Björn Persson <bjorn at rombobjörn.se> wrote:

> From: Björn Persson <bjorn at rombobjörn.se>
> Subject: Re: Character encoding
> To: adil.drissi at yahoo.com, "Community assistance, encouragement, and advice for using Fedora." <fedora-list at redhat.com>
> Date: Saturday, September 6, 2008, 11:56 PM
> Adil Drissi wrote:
> > I want to know what is the encoding type of a file. So i run this command:
> > "file --mime index.php". The output is : index.php: text/html
> >
> > But this does not give any character encoding type.
> >
> > I would like to convert this file to UTF-8 but the command convmv cannot be
> > run without specifying the type of the file with -f option i think.
> 
> There is no general way to find out the character encoding of a random
> piece of data. Some encodings are fairly easy to recognize but the
> numerous eight-bit encodings can be difficult to tell apart. The
> character encoding must always be specified somewhere if it isn't
> implicitly known.
> 
> In some file systems it's possible to specify the character encoding of
> a file as an attribute, but I've never seen it used. HTML can contain a
> meta tag that specifies the encoding, like this:
> 
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> 
> If the HTML file is served by an HTTP server, then the server can specify
> the encoding in the Content-Type header, and there are rules that define
> what the encoding is if the server doesn't specify it.
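
A minimal Python sketch of pulling the charset out of a Content-Type value, whether it comes from an HTTP header or from the content attribute of a meta tag like the one quoted above. The helper name charset_from_content_type is made up for illustration; it relies on the standard library's email.message.Message, which understands this header syntax. It does not implement the fallback rules that apply when no charset is given.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset in lower case,
    # or None when the header carries no charset parameter.
    return msg.get_content_charset()
```

For example, charset_from_content_type("text/html; charset=utf-8") returns "utf-8", while a bare "text/html" gives None, which is the case where the encoding has to be found some other way.
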
> 
> You could open the file in a browser that lets you choose the encoding,
> and try an encoding that you think it may be. Then proofread the text. If
> all the characters are right, then you guessed right, or close enough to
> work for that particular file. If not, try the next encoding.
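
The same trial-and-error can be sketched in Python instead of a browser. This is only an illustration: the function name try_encodings and the candidate list are assumptions, not something from the thread. Note that a clean decode proves little for eight-bit encodings, because ISO-8859-1 accepts any byte sequence; you still have to proofread the result.

```python
def try_encodings(data, candidates=("utf-8", "iso-8859-1", "cp1252")):
    """Return {encoding: decoded_text} for each candidate that decodes cleanly."""
    results = {}
    for enc in candidates:
        try:
            results[enc] = data.decode(enc)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; rule it out.
            continue
    return results


if __name__ == "__main__":
    with open("index.php", "rb") as f:  # file name taken from the question
        for enc, text in try_encodings(f.read()).items():
            print(enc, "->", text[:200])
```

UTF-8 is the useful one to try first, since text in another encoding that contains non-ASCII characters almost never happens to be valid UTF-8.
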
> 
> > o is there a way to convert this file to UTF-8
> 
> Once you know the current encoding, transcoding won't be a big problem.
> If the encoding is specified in the file, such as in a meta tag, then
> you'll have to change that too.
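
A short Python sketch of the transcoding step, assuming the source encoding is already known. The function names are hypothetical. On the command line, iconv does the same job for file contents (for example "iconv -f ISO-8859-1 -t UTF-8"); convmv, mentioned in the question, converts file names rather than file contents.

```python
def transcode_to_utf8(data, source_encoding):
    """Decode bytes from the known source encoding and re-encode as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def transcode_file(src_path, dst_path, source_encoding):
    """Write a UTF-8 copy of src_path to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(transcode_to_utf8(data, source_encoding))
```

Writing to a separate destination file keeps the original intact in case the guessed source encoding turns out to be wrong. Any charset declared inside the file, such as in a meta tag, still has to be updated by hand.
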
> 
> > or better how to set the default character encoding to utf-8?
> 
> Default in what context? The locale settings in the environment include
> a character encoding. Many programs assume that text files and filenames
> are encoded in that encoding, but some programs think they're smarter and
> assume something else. (The approach with environment variables will of
> course fail if different users use different locales and access the same
> files.)
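
For reference, the locale variables that carry the character encoding are the standard POSIX ones: LC_ALL overrides LC_CTYPE, which overrides LANG, and a typical value looks like en_US.UTF-8. The Python sketch below shows how that precedence resolves to a codeset; the function names are made up for illustration, and this is a simplified reading of the rules, not what any particular program does.

```python
import os


def effective_ctype_locale(env):
    """Return the locale name that governs character encoding, or None."""
    # POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
    # An empty value counts as unset.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return None


def codeset_of(locale_name):
    """Extract the codeset from 'language[_territory][.codeset][@modifier]'."""
    if not locale_name:
        return None
    base = locale_name.split("@", 1)[0]  # drop a modifier such as @euro
    _, dot, codeset = base.partition(".")
    return codeset if dot and codeset else None


if __name__ == "__main__":
    print(codeset_of(effective_ctype_locale(os.environ)))
```

Whether a given program honours this setting when saving files is up to the program, as noted above.
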
> 
> Björn Persson

