Re: bash-script -- filename conversion of Whitespace, "'", ""
- From: "Enrique Perez-Terron" <enrio@xxxxxxxxx>
- Date: Tue, 06 Dec 2005 17:44:34 +0100
On Sun, 04 Dec 2005 18:32:13 +0100, Dietrich <Dietrich.Semsar@xxxxxx> wrote:
I have quite a few problems converting filenames and pathnames for
several bash scripts.
The errors or inconsistencies occur both in the Debian and cygwin
system:
System Debian:
bash GNU bash 3.00.0(1)-release
awk GNU Awk 3.1.4
System cygwin:
bash 2.05b
awk GNU awk 3.03
For a script invoking lame I need to convert file names as follows:
name_'_with_single_quote.wav -> name_\'_with_single_quote.wav
name with whitespace.wav -> name\ with\ whitespace.wav
And every possible combination of these.
For another script I need to convert pathnames like
"C:\witless\anypath" to "file://C:/witless/anypath", but that won't work
in a script, just as converting "'" won't work. See below for
details.
Converting whitespace is the only conversion that works. I used a function
with a "while" loop for that:
--snipp--
function read_test()
{
if [ "$#" -gt 1 ]; then
read_test_return="${1}"
shift
while [ "$#" -gt 0 ]
do {
read_test_return="${read_test_return}\ ${1}"
The backslash in front of the space is not needed for quoting: the quotation
marks already quote that space, so the shell keeps the backslash as a literal
character in the result.
shift
}
done
echo ${read_test_return}
else
echo $1
fi
}
--snapp--
This could be done by a sed as well, but at least it works :-) .
The code above has a weakness, though: if a file name has more
than one space character in a row, the result will only have
one backslash-space combo. There are also (conceivably) file
names with tabs in them and with newlines (ugh!).
The whole thing is really not the right way to go about it.
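The collapsing of repeated spaces is easy to reproduce. The sketch below
uses a hypothetical helper, join_with_backslash_space, that mimics read_test
(it is not the original function) and assumes the default IFS:

```bash
#!/bin/bash
# Hypothetical helper mimicking read_test: joins its arguments with "\ ".
join_with_backslash_space() {
    local result=$1
    shift
    while [ "$#" -gt 0 ]; do
        # Inside double quotes, \\ yields one literal backslash.
        result="${result}\\ ${1}"
        shift
    done
    printf '%s\n' "$result"
}

name='Oh  my love.wav'    # note the TWO spaces after "Oh"

# Unquoted expansion: the shell splits on whitespace before the function
# runs, so the function never learns how many spaces separated the words.
escaped=$(join_with_backslash_space $name)

printf '%s\n' "$escaped"
```

The printed name has a single backslash-space between "Oh" and "my", so it
no longer matches the file on disk.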
How will you use the function? Are you writing scripts
where the file names are directly given in the script?
Something like
file_one=$(read_test Oh my love.wav)
musicplayer "$file_one"
Is it?
But why not make it just
musicplayer "Oh my love.wav"
protecting the spaces right at the source?
Or, perhaps you are actually having the script find the file names
on the disk, by listing directories or something similar?
for filename in *
do
musicplayer "$filename"
done
When the shell does "pathname expansion", converting the "*" character
to a list of file names, the file names will not be split into words,
at least not in the "for filename in ..." part of the script. The
parameter "filename" will assume in turn each file name in the current
directory, with spaces and all. Apostrophes too.
However, in the third line, if you omit the double quotation marks
around "$filename", yes, then you lose. Then the "musicplayer"
program, when it starts, will find a command line with four
separate words on it: "musicplayer", "Oh", "my", and "love.wav".
But make sure to include the quotation marks, and everything works.
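To see the difference without a real player, here is a small sketch. The
function count_args is a hypothetical stand-in for "musicplayer" that only
reports how many arguments it received; the scratch directory and file name
are assumptions made for the demonstration:

```bash
#!/bin/bash
# Hypothetical stand-in for "musicplayer": prints its argument count.
count_args() { echo "$#"; }

# Work in a scratch directory holding one file whose name contains spaces.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
touch "Oh my love.wav"

for filename in *
do
    quoted=$(count_args "$filename")    # the whole name arrives as one word
    unquoted=$(count_args $filename)    # split into Oh, my, love.wav
done

echo "quoted: $quoted, unquoted: $unquoted"

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```

With the default IFS this should report one argument for the quoted call and
three for the unquoted one.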
# echo "t'e'st" | sed -e "s/\'/'/g"
t'e'st'
# The "'" at the end is quite strange!
Look, the shell removes the enclosing double quotation marks, and leaves the
rest unchanged, for the programs "echo" and "sed" to see.
Sed thus sees s/\'/'/g
In GNU sed, \' means "the empty string at the end of the buffer". This empty
string is replaced with an apostrophe. Try:
$ echo "t'e'st" | sed -e "s/\'/,/g"
t'e'st,
You probably need to understand better the quoting rules of the shell.
Characters between a pair of unquoted double quotation marks become
quoted characters; the enclosing double quotation marks are removed.
However, the characters \, $, ` and " are special; these do not become
quoted. To quote these, add another \ in front of each occurrence.
So far you probably knew it all.
But what happens if you place a backslash in front of a character that
does not need quoting? The backslash is retained and passed on to the
program. An apostrophe does not need any backslash in front of it to
become quoted in the sense of the shell, when the apostrophe appears
within a pair of unquoted double quotation marks. Therefore, the
backslash in front of the apostrophe is not removed. It is passed
on to sed. And to sed, that backslash-apostrophe combination has a
special meaning.
# echo "t'e'st" | sed -e "s/'/\\'/g"
t'e'st
First, notice that the shell in this case does remove the first backslash.
It removes it because it appears in front of a character that may need it
to become a quoted character: the second backslash. So this time sed sees
\' as the replacement string. But the special meaning "the empty string
at the end of the buffer" only applies in the first half of the s///
expression. In the second half, in the replacement string, it is just an
apostrophe redundantly escaped by a backslash. So in this case sed
substitutes an apostrophe for each of the apostrophes in the input. In
other words, no visible change, in spite of two substitutions.
#echo "t'e'st" | sed -e "s/'/\\\'/g"
t\'e\'st
That's the thing I want. But how strange... Original text >'< may not be
quoted, replacement text >'< must be quoted... #-P
Exactly. And the replacement must be doubly quoted, once because of the shell,
and once for sed. Sed needs to receive a replacement \\' to make it \'.
Ideally, you should have written \\\\' for the shell to reduce it to \\'
and sed to reduce it to \'.
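A simple way to check what sed receives is to let the shell print the
double-quoted text instead of passing it to sed. This is only a sketch; the
variable names script1 to script4 are made up for the illustration:

```bash
#!/bin/bash
# Each assignment uses the same double-quoted text as one of the sed
# expressions discussed above. Printing it shows what sed would receive
# after the shell has processed the double quotes.
script1="s/\'/'/g"      # backslash kept: an apostrophe needs no escaping
script2="s/'/\\'/g"     # \\ collapses to a single backslash
script3="s/'/\\\'/g"    # \\ collapses and \' is kept: two backslashes
script4="s/'/\\\\'/g"   # both \\ pairs collapse: two backslashes

printf '%s\n' "$script1" "$script2" "$script3" "$script4"
```

The third and fourth forms print the same text, which is why both give sed
the replacement \\' and produce \' in the output.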
Converting "'" to "\'" and "\" to "/\" won't work with sed. At least
when given to the shell directly, it works:
# echo "Rumours--04--Don't\ Stop.wav" | sed -e "s-'-\\\'-g"
Rumours--04--Don\'t\ Stop.wav
But used in the script:
echo `echo "${read_test_return}" | sed -e "s-'-\\\'-g"`
produces that:
Rumours--04--Don't\ Stop.wav
This is interesting. The behaviour changes if you use $() instead
of backticks, and I don't know why :)
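A likely explanation, going by the bash manual's description of command
substitution: inside backticks, a backslash followed by $, a backtick, or
another backslash is consumed before the inner command is parsed, whereas
inside $() the text is taken as written. So the backtick form loses one
level of backslashes before the double quotes are even looked at. The sketch
below compares the two forms; the sample name and variable names are
assumptions for the demonstration:

```bash
#!/bin/bash
name="Don't Stop.wav"

# Backticks: \\\' is first reduced to \\', then the double quotes reduce
# it to \', so sed replaces each apostrophe with a plain apostrophe.
with_backticks=`printf '%s\n' "$name" | sed -e "s-'-\\\'-g"`

# $(): only the double-quote rules apply, so sed receives \\' as the
# replacement and emits a backslash followed by an apostrophe.
with_dollar_paren=$(printf '%s\n' "$name" | sed -e "s-'-\\\'-g")

printf '%s\n' "$with_backticks" "$with_dollar_paren"
```

If this reading is right, writing five backslashes inside the backticks, or
simply switching to $(), should give the escaped apostrophe in the script as
well.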
I used "-" as an alternative separator only to make it more readable,
because "\" is used in the replacement string as well. Invoking sed with
"\" instead produces the same result, of course.
You mean "/".
But why doesn't it work in the script? Using the >"< should do the
trick? Any ideas?
The people at comp.unix.shell are better than me at this.
-Enrique