character encoding



When I run 'ls' on a given directory, some of the file names show a question
mark in the place of a non-supported character. In trying to understand
what is happening, I find that I don't understand a couple of fundamentals.

1) What is the default encoding of my Debian system?

2) It seems that a file itself doesn't have any encoding as it is sitting on
the hard drive -- it's just bytes, right? When a given application picks it
up, that application will try to read it as a certain encoding -- how is
that determination made?

3) What is the encoding of the file name? Is this a feature of the
filesystem?
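
To make questions 2 and 3 concrete, here is a minimal Python sketch of the
"it's just bytes" idea: the same name stored under two encodings, and what
comes out when the bytes are read back with the wrong one. The sample name
"café.txt" is hypothetical and chosen only for illustration. The last lines
show what this particular Python process assumes for text and for file
names, which is not necessarily what 'ls' or any other program uses.

```python
import locale
import sys

# Hypothetical file name with one non-ASCII character (U+00E9, e-acute).
name = "caf\u00e9.txt"

# The same text becomes different bytes depending on the encoding chosen.
utf8_bytes = name.encode("utf-8")      # e-acute stored as two bytes: C3 A9
latin1_bytes = name.encode("latin-1")  # e-acute stored as one byte: E9

# Reading UTF-8 bytes as Latin-1 never fails, but yields two wrong characters.
as_latin1 = utf8_bytes.decode("latin-1")

# Reading Latin-1 bytes as UTF-8 hits an invalid sequence; with
# errors="replace" the bad byte becomes U+FFFD, the replacement character.
as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# ascii() keeps the output printable whatever the terminal encoding is.
print(utf8_bytes, latin1_bytes)
print(ascii(as_latin1), ascii(as_utf8))

# What this Python process assumes, derived from the locale settings.
print("text encoding:     ", locale.getpreferredencoding(False))
print("file name encoding:", sys.getfilesystemencoding())
```

The replacement character in the second case is similar in spirit to the
question marks 'ls' prints: the bytes are intact on disk, but they are not
valid (or not printable) under the encoding the reading program assumes.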

I realize these questions may not be that "smart"; please tell me what I'm
missing if so. Also, point me to documentation if you know of some that
explains all of this. I couldn't find anything on the topic searching the
web or the Debian docs.

