Re: [kde] Character sets / encoding



Anne Wilson wrote:
> On Monday 07 September 2009 17:08:25 Carlos Luna wrote:
>> I think this could help you. (It's for config locales in your system)
>> http://www.adslayuda.com/Linux-locales.html
>
> Thanks, but I do have the correct locale installed, and I use utf-8, which
> should, as far as I know, handle all European accented characters without a
> problem. In fact in many applications it does.

After some further research, I am starting to think that this is a bug.
The reason is that it isn't just IBM cp1252 that is screwed up;
ISO 8859-1 has the same problem.

When a text file composed in either code page contains characters >= 128
(> 7F hex) and is opened with UTF-8, it fails to properly decode the
glyphs >= 128. This happens despite the fact that the apps which I have
tried correctly count the number of glyphs before changing them to the
U+FFFD (FFFD hex) replacement character.

The >=128 glyphs which I commonly use are: äëïöüñ. Since I am sending
this email in ISO 8859-1, these characters will not appear correctly if
viewed with UTF-8.
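
The behaviour described above can be reproduced outside any KDE application.
What follows is only a minimal Python sketch, not what the affected apps
actually do internally: it encodes the six glyphs as ISO 8859-1, then decodes
the same bytes as UTF-8. The variable names are made up for illustration.

```python
# Illustrative sample: the accented letters listed in the message.
text = "äëïöüñ"

# ISO 8859-1 stores each of these as a single byte >= 0x80.
latin1_bytes = text.encode("iso-8859-1")

# A strict UTF-8 decoder rejects these bytes, because they do not form
# valid UTF-8 multi-byte sequences.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD for each undecodable byte, so the
# glyph count survives here even though the glyphs themselves are lost.
replaced = latin1_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the text was written in recovers it.
recovered = latin1_bytes.decode("iso-8859-1")

print(strict_ok, len(replaced), recovered)
```

For this particular input the lenient decode yields six U+FFFD characters,
one per original glyph, which matches the "right count, wrong glyphs"
symptom.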

I have found that the only solution to this problem is to set the code
page for incoming mail to either ISO 8859-1 or IBM cp1252.
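
For text files handled by a script rather than a mail client, the same
workaround can be approximated by trying UTF-8 first and falling back to the
single-byte code pages. This is a rough sketch under my own assumptions: the
function name `decode_with_fallback` and the order of the fallbacks are
hypothetical, and it is not how any KDE application selects an encoding.

```python
def decode_with_fallback(data, encodings=("utf-8", "cp1252", "iso-8859-1")):
    """Return (text, encoding_used) for the first encoding that decodes cleanly.

    Hypothetical helper for illustration. UTF-8 is tried first because
    bytes that happen to be valid UTF-8 are very unlikely to be 8-bit
    code-page text. ISO 8859-1 is last because it accepts every byte value.
    """
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the given encodings could decode the data")


# The same glyphs, once as UTF-8 and once as ISO 8859-1 bytes.
print(decode_with_fallback("äëïöüñ".encode("utf-8")))
print(decode_with_fallback("äëïöüñ".encode("iso-8859-1")))
```

Note that a fallback like this can only guess: for the glyphs above cp1252
and ISO 8859-1 use identical byte values, so the two cannot be told apart
from the bytes alone.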

--
JRT



___________________________________________________
This message is from the kde mailing list.
Account management: https://mail.kde.org/mailman/listinfo/kde.
Archives: http://lists.kde.org/.
More info: http://www.kde.org/faq.html.


