Re: Unix-ify File Names



Daniel B. <REMOVEdanielCAPS@xxxxxxx>:
Frank Terbeck wrote:
Mike McClain <mike.mcclain@xxxxxxxxxxx>:
Frank Terbeck <ft@xxxxxxxxxxxxxxxxxxx> wrote:

for FILE in `ls *$1` ; do
...
b) it breaks on filenames with spaces (and other special characters).
... Using 'for i in `ls *`'-type loops breaks this and is one of the
main reasons why people think spaces are bad in filenames.
(They are not bad, ...

In what sense are they not bad? Yes, they're certainly legal per the
filesystem and most tools that take filenames. However, they and other
special characters do make it more difficult to handle arbitrary file
names.

No. They are never bad. It just takes a bit of practice to get used to
doing things in a robust way.

For example, if someone wants to use ls's feature of sorting by date
(e.g., "ls -t *$1"), they can't combine it with the for-loop construct
above (reliably).

Okay, I admit that sorting is one of the rare cases where

[snip]
find . -printf '%Ts:%p\n' | sort -rn | cut -d: -f2- | while IFS= read -r file ; do
...
done
[snap]

or

[snip]
IFS='
'
for i in `find . -printf '%Ts:%p\n' | sort -rn | cut -d: -f2-` ; do
...
done
[snap]

loops are justified. At least in POSIX shell. I really didn't think of
sorting in my original mail. Thanks for pointing that out. (But you
still shouldn't use broken for loops.)

Note that the for loop does _not_ use an external program with
globbing. And it only works with spaces because of the changed $IFS
parameter. This may lead to unexpected results if it is not reset to
its old value inside of the loop.

However, Bash, ksh and zsh users may still overcome this:

[snip]
oifs="$IFS"
IFS='
'
set -- x $(find . -printf '%Ts:%p\n' | sort -rn | cut -d: -f2-)
IFS="$oifs"
shift
while [ -n "$1" ] ; do
echo file: "$1"
shift
done
[snap]

This will _not_ work in a pure POSIX shell like dash, as it only
permits 10 positional parameters; those shells will indeed have to
use a while loop fed by find(1) (as I noted above).

Of course, this breaks with newline characters in filenames, but
newlines are really uncommon (probably only left on a system by users
who don't want their files to be deleted. :-)).
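
For the rare case where newlines in names do matter, here is a minimal
sketch that passes NUL-terminated records through the whole pipeline.
It assumes GNU find (-printf), GNU sort and cut (-z) and an xargs that
knows -0 and -r, so it is not portable POSIX; the function name
newest_first is made up for illustration.

```shell
# Hypothetical helper: list regular files below the current directory,
# newest first, even if a name contains spaces or newlines.
# Assumes GNU find, GNU sort, GNU cut and an xargs with -0 and -r.
newest_first() {
    # 1. find prints "mtime path" records, each terminated by NUL
    # 2. sort orders the records numerically, newest first
    # 3. cut drops the timestamp field and keeps the rest of the record
    # 4. xargs hands the names to a small sh loop as separate arguments
    find . -type f -printf '%T@ %p\0' |
        sort -z -rn |
        cut -z -d ' ' -f 2- |
        xargs -0 -r sh -c 'for f do printf "file: %s\n" "$f" ; done' sh
}
```

The inner 'for f do ... done' iterates over the positional parameters,
so every name arrives intact as "$f".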

And in zsh, you would actually do:
[snip]
for i in **/*(om) ; do foobar $i ; done
[snap]

Yes, zsh does recursive globbing and lets you define the sorting of
the generated file list.

It's really a pity that find(1) does not allow sorting by itself (even
if it were only by a handful of criteria).

But we are slowly drifting off topic here. I just wanted to make sure
that beginners are not confronted with problematic for-loop constructs
like in the first mail I was replying to. Manipulating $IFS is
probably not something to confront beginners with either.

Hey, is there any command for taking a filename and escaping/encoding
shell-special characters to make a string that, when parsed by the
shell, specifies that filename? I'm thinking of something that would
work like this:

for i in `encode_for_shell *` ; ...
[...]

No, that is not how shells work.
Just to repeat this once and for all:
_Never_ do 'for i in `ls *`'. Never. It's broken.
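
To illustrate why an encoder cannot help here, a small sketch: the
output of a command substitution is only split on $IFS (and globbed);
quote characters in it are never parsed again. The function name
count_words is made up for illustration.

```shell
# Hypothetical helper: count the words the shell sees after an unquoted
# command substitution whose output contains quote characters.
count_words() {
    # The substitution prints the five characters: "a b"
    # (including both double quotes).
    set -- $(printf '%s\n' '"a b"')
    # The quotes are ordinary data here, so splitting on the space
    # yields two words, not one.
    echo "$#"
}

count_words
```

So even a perfectly escaped name would still be torn apart at the
space.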

some people just do not know how to handle them properly.)

You might not be, but it sounds like you're blaming users. Sometimes
it's developers of tools (including designers of formats) that don't
have an escape mechanism to handle spaces or other special characters
(or don't provide support for encoding special characters) who are to
blame.

Well, the shell is really really old. It has its flaws. That is why it
is not that easy to use and understand for beginners. Especially if
they are so often taught to do things the wrong way. I admit that it
can be quite difficult to do things right[tm]. I'm making mistakes
when scripting in 'sh' all the time (at least if the script is a
little more than trivial).

[...]
Some people use things like this instead:
[snip]
ls * | while read file ; do whatever_command "$file" ; done
[snap]
This is just a little better than the for loop. It still breaks in
some situations.

I see how it would break with a newline character in a file name.
What other cases break?

Broken aliases.
Too long argument lists. Yeah, 'ls | while ...' does not have the
argument problem, but as soon as you start globbing, it's there.

There is _no_ reason why 'ls' should ever be used to generate file
lists for loops of any kind.
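
For comparison, a minimal sketch of a loop that lets the shell's own
globbing generate the list. The function name list_matching and its
suffix argument are made up for illustration; the './' prefix is there
so that names starting with a dash are not mistaken for options.

```shell
# Hypothetical helper: print every entry in the current directory whose
# name ends in the given suffix, one "file: NAME" line per entry.
list_matching() {
    suffix=$1
    for f in ./*"$suffix" ; do
        # If nothing matches, the pattern is left as a literal string;
        # skip anything that does not actually exist.
        [ -e "$f" ] || continue
        printf 'file: %s\n' "$f"
    done
}
```

No external program is involved, so spaces and even newlines in names
survive, as long as "$f" stays quoted.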

What about things that ls does that the shell's expansion of wildcards
does not do (e.g., sorting by date or size)?

(Maybe ls should have an equivalent to find's "-print0" option.)

In these cases, you use find(1) (in conjunction with other standard
tools, like sort, cut etc.).
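
As a sketch of the size case, assuming GNU find (for -printf): the
function name largest_first is made up for illustration, and like the
date-sorting loops it still breaks on newlines in names.

```shell
# Hypothetical helper: list regular files below the current directory,
# largest first. Assumes GNU find for the -printf action.
largest_first() {
    # find prints "SIZE:PATH"; sort orders by the leading number;
    # cut -f2- keeps everything after the first colon, so colons in
    # file names are preserved.
    find . -type f -printf '%s:%p\n' |
        sort -rn |
        cut -d: -f2- |
        while IFS= read -r file ; do
            printf 'file: %s\n' "$file"
        done
}
```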


Please note that what I am writing here are not must-dos, of course.
I do not intend to attack anybody. I mean, there are people who know
POSIX shell scripting far better than I do, so who am I to judge
others? But 'for i in `ls *`' is really annoyingly wrong, even in my
eyes. :-)

So sometimes, when you are writing one-liners at the shell prompt and
you know what data you are dealing with, you can do whatever works
the quickest. But if you are writing real scripts that are supposed to
work (with data you potentially don't know in advance), you will need
to do things in a proper and robust way.

Sorry for the lengthy mail. I hope I could make myself a little
clearer and didn't spread buggy code. :-)

Regards, Frank

--
In protocol design, perfection has been reached not when there is
nothing left to add, but when there is nothing left to take away.
-- RFC 1925

