Re: A Great Idea (tm) about reimplementing NLS.

From: Patrick McFarland (pmcfarland_at_downeast.net)
Date: 06/16/05

    To: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Date:	Wed, 15 Jun 2005 21:49:05 -0400
    
    
    

    On Monday 13 June 2005 03:20 pm, Alan Cox wrote:
    > An ext3fs is always utf-8. People might have chosen to put other
    > encodings on it but that's "not our fault" ;)

    What happens if you 'field upgrade' ext2 to ext3 by adding a journal? That
    doesn't magically convert !utf-8 to utf-8.

    > There are some good technical reasons too
    >
    > Encodings don't map 1:1 - two names may cease to be unique

    Hold up. Unless the original encoding is 'wrong' and maps two code points to
    what is, in reality, the same character, names should not stop being unique.
    (That would imply the encoding we switched to 'fixed' said 'bug'.)
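
    For reference, a minimal sketch of the one case where names do collide:
    the target encoding cannot represent every character of the source, so
    distinct characters fall back to the same replacement. The helper
    convert_names and the choice of ASCII as the target are assumptions made
    purely for illustration, not anything proposed in this thread.

```python
def convert_names(names, target="ascii"):
    """Re-encode each name into `target`, replacing unrepresentable characters."""
    # errors="replace" substitutes '?' for any character the target lacks.
    return [n.encode(target, errors="replace").decode(target) for n in names]


# Two distinct names (a-umlaut and o-umlaut) in the source encoding.
names = ["\u00e4.txt", "\u00f6.txt"]
converted = convert_names(names)

# Count how many names stopped being unique after the conversion.
collisions = len(names) - len(set(converted))
print(converted, collisions)
```

    A conversion between two encodings that both cover the characters in use
    (KOI8-R to UTF-8, say) does not lose uniqueness this way.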

    > Encodings vary in length - imagine a file name that is longer than the
    > allowed maximum on your system with your encoding choice - that could
    > occur with KOI8-R to UTF-8 I believe

    That's a fault of the file system design, not of the encoding. File systems
    should not limit filenames to such short lengths.
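
    To put numbers on the length problem, here is a small sketch. It assumes
    a 255-byte limit per filename component (the usual ext2/ext3 value) and a
    made-up name of 200 Cyrillic letters; NAME_MAX and fits are illustrative
    names, not kernel interfaces.

```python
NAME_MAX = 255  # assumed per-component limit in bytes (typical for ext2/ext3)

# Hypothetical file name: 200 Cyrillic letters.
name = "\u044f" * 200

koi8_bytes = name.encode("koi8_r")  # one byte per Cyrillic letter
utf8_bytes = name.encode("utf-8")   # two bytes per Cyrillic letter


def fits(encoded: bytes, limit: int = NAME_MAX) -> bool:
    """Return True if the encoded name is within the byte limit."""
    return len(encoded) <= limit


print(len(koi8_bytes), fits(koi8_bytes))
print(len(utf8_bytes), fits(utf8_bytes))
```

    The same name is legal in KOI8-R and too long in UTF-8, because the limit
    counts bytes rather than characters.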

    -- 
    Patrick "Diablo-D3" McFarland || pmcfarland@downeast.net
    "Computer games don't affect kids; I mean if Pac-Man affected us as kids, we'd 
    all be running around in darkened rooms, munching magic pills and listening to
    repetitive electronic music." -- Kristian Wilson, Nintendo, Inc, 1989
    
    



