Re: Encoding issues with literal strings (C++)
- From: Carlos Moreno <moreno_at_mochima_dot_com@xxxxxxxxxxxxxx>
- Date: Wed, 20 Dec 2006 18:10:28 -0500
Pascal Bourguignon wrote:
> John Fusco <fusco_john@xxxxxxxxx> writes:
>> They're both the same problem. I'm not sure if this is a bug or not,
>> but gcc is taking more than two digits to make a string literal. In
>> your example:
>>     "espec\xEDficos"
>> Here gcc is taking the literal as 0xedf, which is out of range. The
>> modulo value of 0xdf is what shows up in your output. I confirmed this
>> behavior in gcc 3.4.4.
>> Again, I always thought C only uses two digits for \x escapes, so this
>> smells like non-conformance to me. However, you can work around it by
>> terminating the sequence with whitespace, or you can make it two
>> strings as follows:
>>     "espec\xED""ficos"
>> This is valid C syntax. The compiler will concatenate these two
>> strings and produce the correct characters.
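
For reference, a small sketch of the adjacent-literal workaround quoted above. The C and C++ grammars place no upper limit on the number of hex digits after \x, so in "espec\xEDficos" the escape keeps going and swallows the 'f'; closing the literal right after \xED is what ends it. The function names here are hypothetical, and reading the byte 0xED as i-acute assumes the output is ISO-8859-1.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper name, for illustration only.
// Closing the literal right after \xED ends the hex escape, so the
// following 'f' is an ordinary character; the compiler then joins the
// two adjacent literals into a single string.
std::string especificos_latin1() {
    return "espec\xED" "ficos";
}

// Checks the resulting bytes: 5 letters, one 0xED byte (i-acute in
// ISO-8859-1, an assumption about the intended encoding), 5 letters.
bool workaround_yields_expected_bytes() {
    const std::string s = especificos_latin1();
    return s.size() == 11
        && static_cast<unsigned char>(s[5]) == 0xED
        && s[6] == 'f';
}
```

The space between the two literals is optional; it is there only to make the split easier to see.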
> What I would do is to keep my sources encoded in utf-8, and just be
> sure to output the HTML with the right "Content-type:...;charset..."
> and META tag.
Except that who said that the output is HTML?
If it were HTML, I'd rather encode it in HTML, and not in UTF-8;
that is, I would have written that as: espec&iacute;ficos (which
is, BTW, how I write it whenever I need to write an HTML document
containing Spanish text).
I was, in fact, considering the possibility of having my application
decode (at run-time) the literal string containing HTML entities,
or some other encoding; even URL-encoding, perhaps: just a %
instead of a \x.
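
A minimal sketch of what that run-time decoding could look like for the URL-encoding variant. The names percent_decode and hex_value are hypothetical, and the choice to copy a malformed or truncated % sequence through unchanged is an assumption, not something settled in this thread. Since a percent escape is always exactly two hex digits, "espec%EDficos" has none of the ambiguity that "espec\xEDficos" has.

```cpp
#include <string>

// Hypothetical helper: returns 0..15 for a hex digit, -1 otherwise.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical decoder: replaces each %XX (exactly two hex digits)
// with the byte it names. Anything that is not a well-formed %XX
// is copied through unchanged (an assumption made for this sketch).
std::string percent_decode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            const int hi = hex_value(in[i + 1]);
            const int lo = hex_value(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;  // skip the two digits just consumed
                continue;
            }
        }
        out += in[i];
    }
    return out;
}
```

With this, percent_decode("espec%EDficos") gives the same 11 bytes as the split literal, at the cost of a pass over the string at run time and of having to write a literal % as %25.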
Thanks,
Carlos