Re: Translating \ escape codes

Jens.Toerring_at_physik.fu-berlin.de
Date: 05/19/05


Date: 19 May 2005 14:07:41 GMT

Måns Rullgård <mru@inprovide.com> wrote:
> Jens.Toerring@physik.fu-berlin.de writes:

>> Kasper Dupont <kasperd@daimi.au.dk> wrote:
>>> Is there some standard function to take one string
>>> and create a new one where \ escape codes have been
>>> translated like the C compiler does? None of the
>>> functions mentioned in man string.h seemed appropriate.
>>
>> None that I would know of. But you can have the one I once wrote for
>> exactly this purpose (what it doesn't do is trying to deal with tri-
>> graphs, and the handling of octal and hexadecimal values is a bit
>> more liberal than what's expected in C, i.e. what's following '\x'
>> can be either 1 or two hexadecimal chars instead of exactly 2 and
>> an octal number can consist of 1, 2 or 3 digits instead of just 3).

> That's what gcc does. I'm not sure what the standards say.

Just had another look at the standard and found that it actually
requires that behaviour, at least if I interpret

          octal-escape-sequence:
                  \ octal-digit
                  \ octal-digit octal-digit
                  \ octal-digit octal-digit octal-digit

          hexadecimal-escape-sequence:
                  \x hexadecimal-digit
                  hexadecimal-escape-sequence hexadecimal-digit

correctly (actually, after '\x' an unrestricted number of hex digits
can follow - but I couldn't find out what happens if the resulting
integer doesn't fit into a char). That differs a bit from what one
might be led to assume from K&R2, Appendix A.
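
To make the grammar quoted above concrete, here is a minimal sketch of
how the two numeric forms could be parsed. It is not the function
posted earlier in this thread; the names parse_octal_escape(),
parse_hex_escape() and hex_value() are made up for illustration. The
octal parser stops after at most three digits, the hex parser takes
every hex digit it finds, and both leave it to the caller to decide
what to do with a value that doesn't fit into a char.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: numeric value of one hex digit. Assumes the
 * caller has already checked isxdigit(c). */
static unsigned hex_value(int c)
{
    if (isdigit(c))
        return (unsigned)(c - '0');
    return (unsigned)(tolower(c) - 'a') + 10u;
}

/*
 * Parse an octal escape. 's' points at the character right after the
 * backslash. Consumes 1, 2 or 3 octal digits, stores the value in
 * '*out' and returns the number of characters consumed (0 if 's' does
 * not start with an octal digit).
 */
size_t parse_octal_escape(const char *s, unsigned *out)
{
    size_t n = 0;
    unsigned value = 0;

    while (n < 3 && s[n] >= '0' && s[n] <= '7') {
        value = value * 8u + (unsigned)(s[n] - '0');
        n++;
    }
    *out = value;
    return n;
}

/*
 * Parse the digits of a hex escape. 's' points right after the "\x".
 * Consumes every hex digit that follows (the grammar sets no upper
 * limit), stores the value in '*out' and returns the number of
 * characters consumed (0 if no hex digit follows). The value may
 * exceed what a char can hold; that check is left to the caller.
 */
size_t parse_hex_escape(const char *s, unsigned *out)
{
    size_t n = 0;
    unsigned value = 0;

    while (isxdigit((unsigned char)s[n])) {
        value = value * 16u + hex_value((unsigned char)s[n]);
        n++;
    }
    *out = value;
    return n;
}
```

With these, the text "1234" after a backslash is read as the octal
escape \123 followed by an ordinary '4', while "414" after "\x" is
read as one hex escape with the value 0x414.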

> As for
> your implementation, all those memmove() calls seem a little
> inefficient.

Could be - but otherwise I would have to allocate a temporary
string, copy the source string over char by char while taking care
of the escape sequences, then copy the whole thing back to the
source string (I actually wanted to do it in place) and free()
the temporary string. So, at least when there are only very few
escape sequences (which I assume to be the most common case), it
shouldn't be that inefficient (I hope ;-).
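
For comparison, a third option would be a single pass over the string
with separate read and write positions. Since every escape sequence is
at least as long as the character it stands for, the write position
can never overtake the read position, so neither memmove() nor a
temporary string is needed. The following is only a sketch of that
idea, not the function posted earlier; the name unescape_in_place() is
invented, and keeping unknown escapes, a trailing backslash and a "\x"
without digits unchanged is an arbitrary choice, as is truncating
too-large values to an unsigned char.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch: translate backslash escapes in 'str' in place.
 * Returns the length of the result, which is '\0'-terminated but may
 * also contain embedded '\0' characters (e.g. from "\0").
 */
size_t unescape_in_place(char *str)
{
    const char *r = str;    /* read position */
    char *w = str;          /* write position, never ahead of 'r' */

    while (*r != '\0') {
        if (*r != '\\') {           /* ordinary character, just copy */
            *w++ = *r++;
            continue;
        }

        r++;                        /* skip the backslash */
        switch (*r) {
        case 'a': *w++ = '\a'; r++; break;
        case 'b': *w++ = '\b'; r++; break;
        case 'f': *w++ = '\f'; r++; break;
        case 'n': *w++ = '\n'; r++; break;
        case 'r': *w++ = '\r'; r++; break;
        case 't': *w++ = '\t'; r++; break;
        case 'v': *w++ = '\v'; r++; break;

        case '\\': case '\'': case '"': case '?':
            *w++ = *r++;            /* escape stands for the char itself */
            break;

        case 'x': {
            const char *p = r + 1;
            unsigned v = 0;

            while (isxdigit((unsigned char)*p)) {
                int c = tolower((unsigned char)*p++);
                v = v * 16u
                    + (unsigned)(isdigit(c) ? c - '0' : c - 'a' + 10);
            }
            if (p == r + 1) {       /* no digits: keep "\x" (assumption) */
                *w++ = '\\';
                *w++ = *r++;
            } else {                /* too-large values get truncated */
                *w++ = (char)(unsigned char)v;
                r = p;
            }
            break;
        }

        case '\0':                  /* trailing backslash: keep it */
            *w++ = '\\';
            break;

        default:
            if (*r >= '0' && *r <= '7') {   /* 1 to 3 octal digits */
                unsigned v = 0;
                int n = 0;

                while (n < 3 && *r >= '0' && *r <= '7') {
                    v = v * 8u + (unsigned)(*r++ - '0');
                    n++;
                }
                *w++ = (char)(unsigned char)v;
            } else {                /* unknown escape: keep both chars */
                *w++ = '\\';
                *w++ = *r++;
            }
            break;
        }
    }

    *w = '\0';
    return (size_t)(w - str);
}
```

Called on a writable array holding the eight characters a, \, t, b,
\, x, 4, 1 this leaves "a", a tab, "b" and the character with value
0x41 in the array and returns 4. It must not be called on a string
literal, since the string is modified.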

                                  Regards, Jens

-- 
  \   Jens Thoms Toerring  ___  Jens.Toerring@physik.fu-berlin.de
   \__________________________  http://www.toerring.de