Re: grep for metacharacters
- From: Doug Morse <morse@xxxxxxxxx>
- Date: Wed, 12 Sep 2007 16:52:44 +0000 (UTC)
hi tim,
well, it turns out i misspoke when i said i was able to match the tab
character -- i happened to have "t" in the line (and only the line) with the
tab character. thus, i also cannot seem to get grep or egrep to match \t, \f,
etc.
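to make the accidental "t" match concrete, here is a small sketch of what the shell actually hands to grep for each quoting style. show_arg is a hypothetical helper invented for this illustration, and i am assuming a POSIX-style shell such as bash. as far as i can tell, GNU grep then treats a backslash followed by an ordinary letter as just that letter, which would explain the match on "t".

```shell
# show_arg is a hypothetical helper: it prints its first argument
# exactly as the shell passed it, one per line.
show_arg() { printf '%s\n' "$1"; }

unquoted=$(show_arg \t)     # shell removes the backslash: grep would see  t
escaped=$(show_arg \\t)     # one backslash survives:      grep would see  \t
quoted=$(show_arg "\t")     # double quotes keep it:       grep would see  \t

printf '%s\n' "$unquoted" "$escaped" "$quoted"
```

in none of the three cases does grep receive a real tab character, only the letter t, with or without a backslash in front of it.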
in running "info grep" and "info regex" i can see no reference to support for
these escaped special characters. worse, i can see no reference to specifying
octal or hex codes for characters (\011 or \x09), which might have been
another way to specify these special characters.
so, rather shockingly, it seems like Linux's grep/egrep DON'T support these
special escape sequences...
two workarounds come to mind:
if you want to stick with grep, use the -f option. i can put a single tab or
formfeed character in a file and match based on that:
grep -f tabchar.txt myfile.txt
grep -f ffchar.txt myfile.txt
not especially "neat", but it works, works in scripts, and avoids any issues
of having to further escape the pattern.
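for completeness, a minimal sketch of how the pattern files themselves could be created without typing the control characters by hand. it assumes a printf that understands \t and \f (the bash builtin and the POSIX utility both do); the temporary directory and the sample contents of myfile.txt are made up for the illustration.

```shell
# work in a scratch directory so nothing real gets overwritten
workdir=$(mktemp -d)

# sample input (invented for this sketch): two lines with a tab,
# one line with a form feed, one plain line
printf 'a\tb\nplain line\nc\td\npage\fbreak\n' > "$workdir/myfile.txt"

# each pattern file holds exactly one pattern: the literal character
printf '\t\n' > "$workdir/tabchar.txt"
printf '\f\n' > "$workdir/ffchar.txt"

tab_lines=$(grep -c -f "$workdir/tabchar.txt" "$workdir/myfile.txt")
ff_lines=$(grep -c -f "$workdir/ffchar.txt" "$workdir/myfile.txt")
echo "lines with a tab: $tab_lines, lines with a form feed: $ff_lines"

rm -r "$workdir"
```

as with any use of -c, the numbers are counts of matching lines, not of individual characters.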
another option is to switch to a program that has the appropriate regex
support, such as perl:
perl -ne '/\011/ and print;' myfile.txt
matches tabs in myfile.txt.
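a further possibility, offered as a sketch rather than something i have verified on RHEL: let printf produce the literal character, keep it in a shell variable, and hand that to grep. this assumes a POSIX-style shell; the variable names and the sample file are invented for the illustration.

```shell
workdir=$(mktemp -d)

# sample input (invented): two lines with a tab, one with a form feed
printf 'a\tb\nplain line\nc\td\npage\fbreak\n' > "$workdir/myfile.txt"

# command substitution strips trailing newlines only, so the
# control character itself survives in the variable
tab=$(printf '\t')
ff=$(printf '\f')

# always quote the variable, otherwise the shell splits on the tab
# and grep receives no usable pattern
tab_count=$(grep -c "$tab" "$workdir/myfile.txt")
ff_count=$(grep -c "$ff" "$workdir/myfile.txt")
echo "tab lines: $tab_count, form feed lines: $ff_count"

rm -r "$workdir"
```

because nothing here depends on typing ^V at a terminal, the same lines can go straight into a script.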
hope this helps.
doug
On Wed, 12 Sep 2007 07:23:08 -0400, Tim Boyer <tim@xxxxxxxxxxxxxx> wrote:
On Wed, 12 Sep 2007 02:34:43 +0000 (UTC), Doug Morse <morse@xxxxxxxxx> wrote:
very strange. have you tried quoting the regex:
grep -c "\t"
you probably know this, but just in case: if you're running this grep
from, say, a bash script, you might have to escape the escape
character:
grep -c \\t
and this can go on ad nauseam, e.g., a subshell in a script might need:
`grep -c \\\\t`
also, i should point out that i didn't read your original post closely
enough and that i should note a correction to it. "grep -c \t" will
count the number of LINES having a tab character, NOT the number of
tab characters (as you wrote).
lastly, i just ran "grep -c \t" and "egrep -c \t" on RHEL4 with a file
containing tabs and it worked just fine. seems like it's gotta be a
backslash escaping issue, or perhaps your file doesn't have any tabs?
It's one I constructed just for this - it's six lines of ^M, ^L, tabs, and
numbers. And egrep in RHEL5 doesn't see _any_ of them:
[root@dg printouts]# egrep -c \t testfile.txt
0
[root@dg printouts]# egrep -c \\t testfile.txt
0
[root@dg printouts]# egrep -c "\t" testfile.txt
0
Now, RH support gave me a way to do it - if I'm looking for, say, page breaks
in that file, type at the command line a ^v^L to insert a page break character
in my 'egrep'. And _that_ works:
[root@dg printouts]# egrep -c ^L testfile.txt
6
[root@dg printouts]# egrep -c ^M testfile.txt
12
But you can't do it in a script, and it won't work for, say, tabs or CRs.