Re: [kde] Character sets / encoding
- From: Patrick Nagel <mail@xxxxxxxxxxxxxxxxx>
- Date: Thu, 10 Sep 2009 14:54:25 +0800
Hi Anne,
On 2009-09-10 14:26, Anne Wilson wrote:
On Wednesday 09 September 2009 21:35:05 James Tyrer wrote:[...]
The >=128 glyphs which I commonly use are: äëïöüñ. Since I am sending
this email in ISO 8859-1, these characters will not appear correctly if
viewed with UTF-8.
I have found that the only solution to this problem is to set the code
page for incoming mail to either ISO 8859-1 or IBM cp 1252.
Not sure what's happening, James. If the characters you typed were umlauted,
as they seem to be, then they are reading correctly on this netbook (I'll
check on another machine later). Here KMail is set to use the following
utf-8
utf-8 (locale)
us-ascii
iso-8859-1
Now whether that means that if one doesn't fit it falls back to the next one, I
don't know. What do you think?
The real problem with charsets and encodings is that you always have to tell
the interpreting program (browser, mail/news reader, ... whichever program
wants to show the bits from the net in a readable form) which charset (and
encoding) was actually used to encode the message, so that it can choose
the matching decoder.
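To illustrate what "telling the interpreting program" looks like in practice, here is a minimal Python sketch using the standard library's email package. The sample message bytes are made up for illustration (a one-line body in ISO 8859-1 with a matching Content-Type header); this is not how KMail does it internally, just one way a reader program could pick the decoder from the declared charset.

```python
from email import message_from_bytes

# Hypothetical raw message: the header declares the charset, and the body
# holds six ISO 8859-1 bytes (one byte per character).
raw_message = (
    b'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"\xe4\xeb\xef\xf6\xfc\xf1\r\n"
)

msg = message_from_bytes(raw_message)

# The declared charset tells us which decoder matches the body bytes.
charset = msg.get_content_charset()

# decode=True undoes the transfer encoding and returns the body as bytes;
# turning those bytes into text is a separate step that needs the charset.
body = msg.get_payload(decode=True).decode(charset)

print(charset)
print(body.strip())
```

Without the charset parameter in the header, get_content_charset() would return None and the program would be back to guessing.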
If this information is not given, there is no way other than guessing, and
everybody knows that computers are not good at that. How would a computer know
what the string 'äëïöüñ' from James should actually look like if he hadn't
specified the encoding in the header? (Open the source of his mail and you
will see the following line: Content-Type: text/plain; charset="iso-8859-1".)
The computer could then, for example, have guessed that those bits were
supposed to mean "潆秭" ("eddy billion" in Chinese)... OK, I admit, I cheated a
bit on this one: it wouldn't have been a valid bit sequence for a GBK decoder,
which any sane guessing algorithm would have detected. But still, I think you
get the point.
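The guessing problem can be sketched in a few lines of Python. The helper name try_decode and the list of candidate charsets are my own choices for illustration; the idea is simply to take the bytes that 'äëïöüñ' becomes in ISO 8859-1 and hand them to several decoders.

```python
text = "äëïöüñ"
raw = text.encode("iso-8859-1")  # six bytes, all >= 128


def try_decode(data, charset):
    """Return the decoded string, or None if the bytes are invalid in charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# Same bytes, different decoders. UTF-8 rejects them outright; cp1252 agrees
# with ISO 8859-1 for these particular characters; other decoders may yield
# something else entirely or fail.
for charset in ("iso-8859-1", "cp1252", "utf-8", "gbk"):
    print(charset, repr(try_decode(raw, charset)))
```

A decoder that fails at least tells the program the guess was wrong. The nasty case is a decoder that succeeds and produces different characters, because nothing in the bytes themselves says which result was intended.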
So, people, use Unicode (the "universal charset") encoded as UTF-8 for
everything - and maybe in a few years we can all forget about all this
charset/encoding mess :)
Patrick.
P.S.: I used Unicode/UTF-8 in this mail (and of course it's specified in the
mail's header), otherwise it wouldn't even have been possible to put both
Chinese characters and umlauts in one mail.
--
Key ID: 0x86E346D4 http://patrick-nagel.net/key.asc
Fingerprint: 7745 E1BE FA8B FBAD 76AB 2BFC C981 E686 86E3 46D4
This message is from the kde mailing list.
Account management: https://mail.kde.org/mailman/listinfo/kde.
Archives: http://lists.kde.org/.
More info: http://www.kde.org/faq.html.