Re: [opensuse] uncompressing zip files and accented characters



Philipp Thomas wrote:

* Dave Howorth (dhoworth@xxxxxxxxxxxxxxxxx) [20100713 11:40]:

Is that wrong? What has unzip transformed the filenames into if it
hasn't preserved them?

Ok, once again:
Zip writes the names to the archive in whatever encoding the
originating machine uses. However, it does *not* record which encoding
was used. So in this case unzip will read the names encoded in, say,
latin-2 (a single-byte encoding) and will write them as utf8 (a
multi-byte encoding), which of course results in the gibberish the OP
posted.

Isn't it rather that unzip simply dumps whatever filenames were
zipped, and that the terminal attempts to display those names as if
they were utf8?
Or does unzip really convert from (for instance) latin-2 to utf8?
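
The two possibilities leave different traces on disk, so they can be
told apart. Below is a minimal Python sketch of both. The filename and
the choice of latin-1 as the wrongly assumed charset are made up for
illustration; nothing here is taken from the OP's archive or from
unzip's actual code.

```python
# Hypothetical filename; latin-2 (ISO 8859-2) stands in for the
# encoding of the machine that created the archive.
name = "żółw.txt"
raw = name.encode("iso8859_2")  # the bytes zip would store, with no encoding recorded

# Possibility A: the bytes are written to disk unchanged and a UTF-8
# terminal tries to display them. They are not valid UTF-8, so the
# terminal can only show replacement characters.
shown_by_terminal = raw.decode("utf-8", errors="replace")

# Possibility B: the extracting tool assumes a wrong single-byte charset
# (latin-1 here, purely as an assumption) and transcodes to UTF-8.
# The result is valid UTF-8, but the characters are wrong.
transcoded = raw.decode("latin-1").encode("utf-8")
shown_after_transcoding = transcoded.decode("utf-8")  # "¿ó³w.txt"

# In case A the original name is recovered by decoding the untouched
# bytes with the right charset; in case B the wrong transcoding has to
# be undone first.
recovered_a = raw.decode("iso8859_2")
recovered_b = transcoded.decode("utf-8").encode("latin-1").decode("iso8859_2")

print(repr(shown_by_terminal))
print(repr(shown_after_transcoding))
print(recovered_a, recovered_b)
```

Listing the extracted directory with something that shows raw bytes
(for example `ls -b`) would tell which case applies: invalid UTF-8
bytes point to A, valid but wrong multi-byte sequences point to B.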



--
Per Jessen, Zürich (23.9°C)



