Re: [kde-linux] Encoding questions (Chusslove Illich)
- From: Emanoil Kotsev <deloptes@xxxxxxxxx>
- Date: Thu, 12 Jun 2008 07:12:54 -0700 (PDT)
On Monday 09 June 2008 08:14 am, Emanoil Kotsev
wrote:
The encoding for the merriam-webster page seems to be
iso8859-1.
The site is definitely iso8859-1 encoded.
One thing I noticed the other day, but forgot to
mention: Yes, there is
something on that page that seems to say the page is
encoded in iso8859-1:
<meta http-equiv="Content-Type" content="text/html;
charset=iso-8859-1" />
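For anyone who wants to check what a page declares without reading the source by hand, here is a minimal sketch in Python using only the standard library. The class name MetaCharsetFinder and the sample string are made up for illustration; the sample just reuses the meta tag quoted above. Note that this only reports what the page claims, not how the bytes are actually encoded.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Hypothetical helper: remembers the first charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags matter, and only the first declaration is kept.
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        # Short form: <meta charset="...">
        if attrs.get("charset"):
            self.charset = attrs["charset"].strip().lower()
            return
        # Long form: <meta http-equiv="Content-Type" content="text/html; charset=...">
        if (attrs.get("http-equiv") or "").lower() == "content-type":
            for part in (attrs.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset":
                    self.charset = value.strip().lower()


# Sample input: the meta tag quoted in this thread, on one line.
sample = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />'

finder = MetaCharsetFinder()
finder.feed(sample)
print(finder.charset)
```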
But elsewhere on the same page there are lines that
suggest that at least some
part of it might be encoded in utf-8:
google_afs_ie = 'utf8'; // select input encoding scheme
google_afs_oe = 'utf8'; // select output encoding scheme
These are properties/arguments of the googleads
function. I guess they are used to convert the encoding
of the data into the one you are using
(http://www.merriam-webster.com/dictionary/intelligent):
My guess is that the definition is fetched from some
database and displayed
using utf-8. (On the other hand, maybe the utf-8 is
only for google ads or
similar displayed on that page?) (A further guess
is that, if the
pronunciation key were displayed in iso-8859-1 ...
Well, wait--the one clue I have is that if I C&P the
definition from konqueror
to kate, with kate changed to a font that can
display the correct glyphs (the
upside down e, for example), the pronunciation key
is displayed correctly in
kate. Would that work if the encoding on the
konqueror page was iso-8859-1,
or only if it was utf-8? I'm not sure, and don't
desperately need to know at
the moment. ;-)
Just for reference, here is a C&P of the
pronunciation "key" from one m-w page
Pronunciation: \in-ˈte-lə-jənt\
I guess I just wanted to note that there is some
uncertainty, at least in my
mind, as to whether the definition on the m-w.com
pages is encoded in
iso-8859-1 or utf-8. If it is encoded in
iso-8859-1, could it be displayed
properly if C&P'd into kate?
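One way to probe this question is to check whether the characters in the pronunciation key exist in iso-8859-1 at all. The following is a minimal sketch in Python; the helper name can_encode is made up for illustration, and the key string is the one from the m-w page, written with escapes for U+02C8 (ˈ) and U+0259 (ə).

```python
# Pronunciation key from the page: U+02C8 is the stress mark, U+0259 the schwa.
key = "\\in-\u02c8te-l\u0259-j\u0259nt\\"


def can_encode(text, encoding):
    """Hypothetical helper: True if every character of text exists in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(key, "iso-8859-1"))  # the stress mark and schwa are outside Latin-1
print(can_encode(key, "utf-8"))       # UTF-8 covers all of Unicode
```

Since iso-8859-1 has no byte for either character, a page that really is iso-8859-1 cannot carry them as raw bytes. As for the copy and paste, the clipboard presumably hands kate already-decoded text rather than the page's bytes, so a correct result in kate does not by itself tell you which encoding the page used.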
Look at the source code of the page and you'll find the
secret:
<dt class="pron">Pronunciation:</dt>
<dd class="pron">
<span class="pronchars">\in-<span
class="unicode">ˈ</span>te-lə-jənt\</span>
</dd>
This means they use the W3C recommendation for encoding
characters in HTML from the Unicode definition.
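A small sketch of how that can work, in Python with only the standard library. It assumes the page source wrote the special characters as decimal numeric character references (&#712; for ˈ and &#601; for ə); that form is an assumption, since the excerpt above shows the characters already decoded. Such references are plain ASCII, so they survive in a page declared as iso-8859-1.

```python
import html

# Assumed form of the page source: decimal numeric character references.
source_fragment = "\\in-&#712;te-l&#601;-j&#601;nt\\"

# The fragment is pure ASCII, so it fits in iso-8859-1 without loss.
fragment_bytes = source_fragment.encode("iso-8859-1")

# A browser resolves the references to Unicode characters when rendering.
decoded = html.unescape(fragment_bytes.decode("iso-8859-1"))
print(decoded)

# Going the other way: characters missing from iso-8859-1 become references.
round_trip = decoded.encode("iso-8859-1", errors="xmlcharrefreplace")
print(round_trip)
```

The point of the sketch is that the declared charset only governs how the bytes are read; the references are resolved afterwards, so the rendered text can contain characters the charset itself cannot represent.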
Welcome to the encodings hell!
regards
___________________________________________________
This message is from the kde-linux mailing list.
Account management: https://mail.kde.org/mailman/listinfo/kde-linux.
Archives: http://lists.kde.org/.
More info: http://www.kde.org/faq.html.