Re: Apache unicode question
From: prg (rdgentry1_at_cablelynx.com)
Date: 04/19/05
- Next message: Alan Connor: "Re: Look At These Fools"
- Previous message: SINNER: "Re: Newbie question: problems setting DISPLAY to X server"
- In reply to: alex: "Apache unicode question"
- Next in thread: alex_at_alexfeldman.org: "Re: Apache unicode question"
- Reply: alex_at_alexfeldman.org: "Re: Apache unicode question"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 19 Apr 2005 09:13:58 -0700
alex wrote:
> If anyone here can answer this, I would appreciate it. Failing that,
> please point me to a better place to ask - I couldn't find one, nor a
> FAQ which deals with this.
>
> I am running Apache from Fedora Core 3. Some of the pages that have
> been created for the server use unicode characters. ...
What do you mean by "use unicode characters"? Hardcoded, numeric,
Unicode representation? Non-ASCII characters?
> ... I don't know why
> they were created this way, we get lots of contributors to this site,
And thus a number of authoring platforms, tools, etc. all with their
own quirks/deficiencies/shortcuts :(
> and the person who was in charge of this died recently. But in any
> event, they were. The problem comes up with punctuation marks, which
> come up correctly in some browsers, but not in others. ...
Which browsers/versions? Do you really want to support every browser
quirk currently known to "force" a particular rendering?
> ... One of the
> pages that has the problem is
>
> http://diamond.boisestate.edu/gas/whowaswho/G/GrossmithGeorge.htm
>
> You should see, in some browsers, odd marks that are supposed to be
> quotes, apostrophes, etc.
"odd marks" means what? Squares? Asterisks? Platform or font
substitutes? Blanks?
Please give us the specific character encodings used (i.e., charset,
codepage, OS platform, etc.). These are absent from the above page, so
each browser is likely to "do its best" according to how the user has
configured the browser.
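
To check whether a page declares a charset at all, something like the following Python sketch could be run over the stored file. The function name `declared_charset` and the 4096-byte window are my own choices for illustration, and it only looks at <meta> tags, not at the HTTP Content-Type header the server sends.

```python
import re

# Matches <meta charset="..."> as well as the older form
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)

def declared_charset(html_bytes):
    """Return the charset named in a <meta> tag (lowercased), or None."""
    # Assumption: the declaration, if any, sits near the top of the file.
    match = META_CHARSET.search(html_bytes[:4096])
    return match.group(1).decode("ascii").lower() if match else None

if __name__ == "__main__":
    sample = b'<html><head><title>t</title></head><body>it\x92s</body></html>'
    print(declared_charset(sample))
```

A result of None means the browser is left to guess, which matches the behavior described above.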
> Is there some switch I can throw so that these get interpreted
> "correctly", or at least differently, as they are viewed? ...
What do you mean by "correctly"? Do you want _typographical_
apostrophe and quote? Just ASCII?
> ... Looking at
> the file under od, these "problem characters" are the only ones that
> have extended unicode encodings - all the others are straight ascii.
Do you mean running the file as stored on the server through od? As
delivered to some clients? What char codes does it show?
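
If reading od output is tedious, a short Python sketch along these lines would list just the byte values above 0x7F and how often each occurs. The names `non_ascii_bytes` and `report` are made up for this example; the path is whatever file you want to inspect.

```python
from collections import Counter

def non_ascii_bytes(data):
    """Count each byte value above 0x7F found in data (a bytes object)."""
    return Counter(b for b in data if b > 0x7F)

def report(path):
    """Print one line per non-ASCII byte value found in the file at path."""
    with open(path, "rb") as fh:
        counts = non_ascii_bytes(fh.read())
    for value, count in sorted(counts.items()):
        print(f"0x{value:02X}  x{count}")
```

Single bytes in the 0x80-0x9F range suggest a Windows codepage; multi-byte runs starting with 0xC2-0xF4 suggest UTF-8.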
> I have a feeling the page in question may have been created in Europe,
> while most of our pages were created in the US, if that is relevant.
The chars mentioned are rendered differently by my browser (Konqueror
on Linux) in "automatic modes". In "manual" modes (View|Set Encoding)
it varies even more depending on what I select ;)
Depending on the authoring tools and how they were configured and/or
their default behavior, this could present problems/inconsistencies
with some chars in the source files. How they will be interpreted may
be a crapshoot ;)
There are ways to _attempt_ to force client browsers to render them as
"extended, typographical" chars, but the users' browsers may:
-- use a different, specified, default charset (encoding)
-- not have an appropriate font (or substitute) to render them
-- simply ignore "instructions" from the server despite all your
efforts
-- have a bug
These and some other chars are particularly difficult to handle,
especially without some MS TrueType fonts on the computer. These map
on Windows into the extended ANSI range in page positions peculiar to
Windows. But even the Unicode standard has trouble with these ;)
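
To illustrate the Windows-specific positions, here is a small Python sketch that decodes the same bytes three ways. The sample bytes are an assumption (typical Windows-1252 "smart" quotes and apostrophe), not taken from the page in question, and `interpretations` is a name invented for the example.

```python
def interpretations(raw):
    """Decode raw bytes under three charsets; None marks a failed decode."""
    results = {}
    for codec in ("cp1252", "latin-1", "utf-8"):
        try:
            results[codec] = raw.decode(codec)
        except UnicodeDecodeError:
            results[codec] = None
    return results

if __name__ == "__main__":
    # 0x93/0x94 are curly double quotes and 0x92 a curly apostrophe in
    # Windows-1252; in ISO-8859-1 the same bytes are C1 control codes.
    sample = b"\x93quoted\x94 and it\x92s"
    for codec, text in interpretations(sample).items():
        print(codec, repr(text))
```

The same bytes give typographic punctuation under one charset, invisible control characters under another, and an outright error under UTF-8, which is why browsers disagree when nothing tells them which charset to use.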
I did not look, but I seem to recall some scripts running around that
will "sanitize/scrub" source files looking for problematic chars and
inserting the site "standard". Perl?
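
I have no particular script to point to, but the idea can be sketched in Python (rather than Perl) roughly as follows. It assumes the source files are single-byte Windows-1252 and that the site "standard" is numeric character references; both are assumptions, and the table and the name `scrub` are only illustrative. Do not run this on UTF-8 files, where the same byte values occur inside multi-byte sequences.

```python
# Windows-1252 punctuation bytes mapped to HTML numeric character references.
CP1252_TO_NCR = {
    0x85: b"&#8230;",  # horizontal ellipsis
    0x91: b"&#8216;",  # left single quote
    0x92: b"&#8217;",  # right single quote / apostrophe
    0x93: b"&#8220;",  # left double quote
    0x94: b"&#8221;",  # right double quote
    0x96: b"&#8211;",  # en dash
    0x97: b"&#8212;",  # em dash
}

def scrub(data):
    """Return data with the listed bytes replaced; all others pass through."""
    out = bytearray()
    for byte in data:
        out += CP1252_TO_NCR.get(byte, bytes([byte]))
    return bytes(out)

if __name__ == "__main__":
    print(scrub(b"it\x92s \x93here\x94"))
```

Numeric references are plain ASCII, so the result renders the same regardless of which charset the browser guesses, provided a suitable font is available.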
Here are some links that _may_ help somewhat, or at least provide some
ideas on how to achieve site-wide consistency (without CSS).
http://www.w3.org/TR/REC-html40/charset.html
http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
http://glasnost.beeznest.org/articles/139
http://httpd.apache.org/docs/mod/core.html#adddefaultcharset
http://httpd.apache.org/docs/content-negotiation.html
http://www.i18nguy.com/markup/serving.html
Google searches:
http://www.google.com/search?&q=html%20encode%20unicode
http://www.google.com/search?&q=apache+html+encode+unicode
http://www.google.com/search?&q=apache+html+charset
good luck,
prg