Encoding issues with literal strings (C++)




Hi,

I'm a bit puzzled by the following.

My application is a client/server, where the server runs on Linux
and is written in C++. The client runs on Windows and is written
in Borland C++ Builder 6.

Since it is in Spanish (most of the users are hispanophones), I
have many messages that the server sends that include accented
characters (in HTML, &aacute;, &eacute;, etc.).

Some of these messages come from literal strings, with embedded
\x sequences to represent the special characters in ISO-8859-1
(or rather, Windows-1252).

For instance, in LATIN1 (ISO-8859-1) and in Windows-1252 encodings,
the a with acute accent has the code 0xE1; the o with acute accent
has code 0xF3 ... So I write those just like that (well, \xE1 and
\xF3 in the literal strings), and it works.

But I have two puzzling problems:

1) When I write the i with acute accent (which has code 0xED), that
one doesn't work (it shows up as a Greek letter beta on the client,
and the letter after it doesn't show).

When I do a hexdump -C of the executable, I see that the string
is not the same! The \xED character has been replaced by a
0xDF, and the character after the \xED is missing! Here:

The literal string is: " ..... espec\xEDficos ..... "

The hexdump output (the relevant line) is:

65 73 20 65 73 70 65 63 df 69 63 6f 73 2e 20 20 |es espec.icos.  |

Why did that happen? How do I avoid it (without having to
manually edit the executable, that is)? I have the feeling that
it has to do with UTF-8 encoding, perhaps invalid UTF-8 sequences
that the compiler is "fixing". But if that is the case, why?
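
One possible explanation that fits the dump, sketched here under the
assumption that the compiler follows the standard C++ rule for hex
escapes: a \x escape has no fixed length and consumes every hex digit
that follows it. Since 'f' is a hex digit, "\xEDf" would be read as
the single value 0xEDF, which does not fit in a char; keeping only
the low byte gives 0xDF and the 'f' disappears. The names below
(kSplit, kOctal, same_bytes, check_literals) are made up for
illustration, and the two workarounds are common idioms rather than
anything taken from the server code.

```cpp
#include <cassert>
#include <string>

// Hypothetical names; only the word itself comes from the post.
//
// Form from the post, kept in a comment because some compilers reject
// it outright instead of warning:
//     "espec\xEDficos"
// The escape swallows the 'f' (a valid hex digit) and becomes \xEDF.

// Workaround 1: close the literal right after the escape. Adjacent
// string literals are concatenated after escapes have been processed.
constexpr char kSplit[] = "espec\xED" "ficos";

// Workaround 2: an octal escape stops after at most three digits
// (0xED == 0355), so the following 'f' is left alone.
constexpr char kOctal[] = "espec\355ficos";

// Compile-time checks on the bytes actually stored.
static_assert(sizeof(kSplit) == 12, "11 bytes plus the terminating NUL");
static_assert(static_cast<unsigned char>(kSplit[5]) == 0xED, "i-acute byte");
static_assert(kSplit[6] == 'f', "the 'f' is still there");

// Runtime comparison of the two spellings.
inline bool same_bytes() {
    return std::string(kSplit) == std::string(kOctal);
}

inline void check_literals() {
    assert(same_bytes());
}
```

If this is the cause, the same thing would happen to any \x escape
that is immediately followed by one of 0-9, a-f or A-F, while an
escape followed by any other letter would be left intact.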


2) The other thing is that I'm getting a compiler warning, "hex
escape sequence out of range", for the \xF3, yet that character
shows up OK (the o with acute accent).


Thanks for any ideas!

Carlos