worked *perfectly* for anything I put in my "exclude" file. That's the solution. Bingo.
But why? What does that "F" (that I wasn't originally using) do??? "Fixed strings"? That's a handy option.
Using fixed strings (the -F option) instead of basic regular expressions (the default) or extended regular expressions (the -E option) is faster when the patterns contain no characters that are special in a regular expression. But in this case the results should be the same with or without -F, apart from how long the command takes to complete, as long as the exclude file and all files being processed are proper text files. So either one of your files isn't a text file, there is a bug in the version of grep you're using, or some hidden characters in your exclude file are affecting regular expression parsing. It would still be interesting to see the output of:
Code:
od -bc exclude
for a version of exclude that causes:
Code:
grep -vf exclude log.txt > out
to go into never-never land.
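For what it's worth, here is a minimal sketch of the "results should be the same" claim. The log lines and the three patterns are made up for illustration (they are not the poster's real files), and everything is written to a scratch directory from mktemp:

```shell
# Scratch directory so the demo does not touch real files.
dir=$(mktemp -d)

# Hypothetical, simplified log lines (not the real log.txt).
printf '%s\n' \
    'GET /index.html 200' \
    'GET /missing.html 404' \
    'GET /~dan/talk.ppt 200' \
    'GET /song.mp3 200' > "$dir/log.txt"

# Patterns with no regular-expression metacharacters, one per line.
# printf '%s\n' ends every line, including the last, with a <newline>.
printf '%s\n' 404 mp3 dan > "$dir/exclude"

# Same exclusion, once as basic regular expressions, once as fixed strings.
regex_out=$(grep -vf "$dir/exclude" "$dir/log.txt")
fixed_out=$(grep -Fvf "$dir/exclude" "$dir/log.txt")

if [ "$regex_out" = "$fixed_out" ]; then
    echo "same output with and without -F"
fi
printf '%s\n' "$fixed_out"

rm -r "$dir"
```

With proper text files and patterns like these, only the speed should differ between the two commands.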
od -bc exclude
0000000 062 060 066 012 063 060 064 012 064 060 063 012 064 060 064 012
2 0 6 \n 3 0 4 \n 4 0 3 \n 4 0 4 \n
0000020 064 060 065 012 065 060 060 012 151 156 166 151 164 145 012 155
4 0 5 \n 5 0 0 \n i n v i t e \n m
0000040 160 063 012 144 141 156 012 163 157 146 151 141
p 3 \n d a n \n s o f i a
0000054
and
Code:
tail -5 log.txt|od -bc
0000000 061 070 060 056 067 066 056 065 056 062 060 040 055 040 055 040
1 8 0 . 7 6 . 5 . 2 0 - -
0000020 133 063 060 057 116 157 166 057 062 060 061 063 072 060 064 072
[ 3 0 / N o v / 2 0 1 3 : 0 4 :
0000040 061 066 072 062 071 040 055 060 066 060 060 135 040 042 107 105
1 6 : 2 9 - 0 6 0 0 ] " G E
0000060 124 040 057 176 146 151 163 157 057 164 145 154 145 143 157 156
T / ~ f i s o / t e l e c o n
0000100 057 102 141 147 144 151 147 151 141 156 055 103 141 162 162 141
/ B a g d i g i a n - C a r r a
0000120 163 161 165 151 154 154 157 137 065 055 062 062 055 061 063 170
s q u i l l o _ 5 - 2 2 - 1 3 x
0000140 057 040 110 124 124 120 057 061 056 061 042 040 064 060 063 040
/ H T T P / 1 . 1 " 4 0 3
0000160 062 065 066 012 064 056 062 066 056 061 063 062 056 067 060 040
2 5 6 \n 4 . 2 6 . 1 3 2 . 7 0
0000200 055 040 055 040 133 063 060 057 116 157 166 057 062 060 061 063
- - [ 3 0 / N o v / 2 0 1 3
0000220 072 060 064 072 061 070 072 062 066 040 055 060 066 060 060 135
: 0 4 : 1 8 : 2 6 - 0 6 0 0 ]
0000240 040 042 120 117 123 124 040 057 045 067 060 045 066 070 045 067
" P O S T / % 7 0 % 6 8 % 7
0000260 060 045 067 060 045 066 061 045 067 064 045 066 070 057 045 067
0 % 7 0 % 6 1 % 7 4 % 6 8 / % 7
0000300 060 045 066 070 045 067 060 077 045 062 104 045 066 064 053 045
0 % 6 8 % 7 0 ? % 2 D % 6 4 + %
0000320 066 061 045 066 103 045 066 103 045 066 106 045 067 067 045 065
6 1 % 6 C % 6 C % 6 F % 7 7 % 5
0000340 106 045 067 065 045 067 062 045 066 103 045 065 106 045 066 071
F % 7 5 % 7 2 % 6 C % 5 F % 6 9
0000360 045 066 105 045 066 063 045 066 103 045 067 065 045 066 064 045
% 6 E % 6 3 % 6 C % 7 5 % 6 4 %
0000400 066 065 045 063 104 045 066 106 045 066 105 053 045 062 104 045
6 5 % 3 D % 6 F % 6 E + % 2 D %
0000420 066 064 053 045 067 063 045 066 061 045 066 066 045 066 065 045
6 4 + % 7 3 % 6 1 % 6 6 % 6 5 %
0000440 065 106 045 066 104 045 066 106 045 066 064 045 066 065 045 063
5 F % 6 D % 6 F % 6 4 % 6 5 % 3
0000460 104 045 066 106 045 066 066 045 066 066 053 045 062 104 045 066
D % 6 F % 6 6 % 6 6 + % 2 D % 6
0000500 064 053 045 067 063 045 067 065 045 066 070 045 066 106 045 067
4 + % 7 3 % 7 5 % 6 8 % 6 F % 7
0000520 063 045 066 071 045 066 105 045 062 105 045 067 063 045 066 071
3 % 6 9 % 6 E % 2 E % 7 3 % 6 9
0000540 045 066 104 045 067 065 045 066 103 045 066 061 045 067 064 045
% 6 D % 7 5 % 6 C % 6 1 % 7 4 %
0000560 066 071 045 066 106 045 066 105 045 063 104 045 066 106 045 066
6 9 % 6 F % 6 E % 3 D % 6 F % 6
0000600 105 053 045 062 104 045 066 064 053 045 066 064 045 066 071 045
E + % 2 D % 6 4 + % 6 4 % 6 9 %
0000620 067 063 045 066 061 045 066 062 045 066 103 045 066 065 045 065
7 3 % 6 1 % 6 2 % 6 C % 6 5 % 5
0000640 106 045 066 066 045 067 065 045 066 105 045 066 063 045 067 064
F % 6 6 % 7 5 % 6 E % 6 3 % 7 4
0000660 045 066 071 045 066 106 045 066 105 045 067 063 045 063 104 045
% 6 9 % 6 F % 6 E % 7 3 % 3 D %
0000700 062 062 045 062 062 053 045 062 104 045 066 064 053 045 066 106
2 2 % 2 2 + % 2 D % 6 4 + % 6 F
0000720 045 067 060 045 066 065 045 066 105 045 065 106 045 066 062 045
% 7 0 % 6 5 % 6 E % 5 F % 6 2 %
0000740 066 061 045 067 063 045 066 065 045 066 064 045 066 071 045 067
6 1 % 7 3 % 6 5 % 6 4 % 6 9 % 7
0000760 062 045 063 104 045 066 105 045 066 106 045 066 105 045 066 065
2 % 3 D % 6 E % 6 F % 6 E % 6 5
0001000 053 045 062 104 045 066 064 053 045 066 061 045 067 065 045 067
+ % 2 D % 6 4 + % 6 1 % 7 5 % 7
0001020 064 045 066 106 045 065 106 045 067 060 045 067 062 045 066 065
4 % 6 F % 5 F % 7 0 % 7 2 % 6 5
0001040 045 067 060 045 066 065 045 066 105 045 066 064 045 065 106 045
% 7 0 % 6 5 % 6 E % 6 4 % 5 F %
0001060 066 066 045 066 071 045 066 103 045 066 065 045 063 104 045 067
6 6 % 6 9 % 6 C % 6 5 % 3 D % 7
0001100 060 045 066 070 045 067 060 045 063 101 045 062 106 045 062 106
0 % 6 8 % 7 0 % 3 A % 2 F % 2 F
0001120 045 066 071 045 066 105 045 067 060 045 067 065 045 067 064 053
% 6 9 % 6 E % 7 0 % 7 5 % 7 4 +
0001140 045 062 104 045 066 105 040 110 124 124 120 057 061 056 061 042
% 2 D % 6 E H T T P / 1 . 1 "
0001160 040 064 060 064 040 062 061 067 012 062 061 062 056 064 060 056
4 0 4 2 1 7 \n 2 1 2 . 4 0 .
0001200 061 063 066 056 062 065 040 055 040 055 040 133 063 060 057 116
1 3 6 . 2 5 - - [ 3 0 / N
0001220 157 166 057 062 060 061 063 072 060 064 072 062 067 072 064 071
o v / 2 0 1 3 : 0 4 : 2 7 : 4 9
0001240 040 055 060 066 060 060 135 040 042 107 105 124 040 057 176 146
- 0 6 0 0 ] " G E T / ~ f
0001260 151 163 157 057 164 145 154 145 143 157 156 057 103 157 156 144
i s o / t e l e c o n / C o n d
0001300 157 156 137 071 055 061 070 055 061 063 057 040 110 124 124 120
o n _ 9 - 1 8 - 1 3 / H T T P
0001320 057 061 056 061 042 040 062 060 060 040 066 060 066 071 012 070
/ 1 . 1 " 2 0 0 6 0 6 9 \n 8
0001340 065 056 063 061 056 062 061 071 056 066 064 040 055 040 055 040
5 . 3 1 . 2 1 9 . 6 4 - -
0001360 133 063 060 057 116 157 166 057 062 060 061 063 072 060 064 072
[ 3 0 / N o v / 2 0 1 3 : 0 4 :
0001400 062 071 072 062 061 040 055 060 066 060 060 135 040 042 107 105
2 9 : 2 1 - 0 6 0 0 ] " G E
0001420 124 040 057 176 144 141 156 057 114 145 163 164 145 162 137 111
T / ~ d a n / L e s t e r _ I
0001440 123 104 103 062 060 061 063 056 160 160 164 040 110 124 124 120
S D C 2 0 1 3 . p p t H T T P
0001460 057 061 056 061 042 040 062 060 060 040 064 070 064 065 065 066
/ 1 . 1 " 2 0 0 4 8 4 5 5 6
0001500 070 012 066 066 056 062 064 071 056 067 063 056 061 060 070 040
8 \n 6 6 . 2 4 9 . 7 3 . 1 0 8
0001520 055 040 055 040 133 063 060 057 116 157 166 057 062 060 061 063
- - [ 3 0 / N o v / 2 0 1 3
0001540 072 060 064 072 062 071 072 064 061 040 055 060 066 060 060 135
: 0 4 : 2 9 : 4 1 - 0 6 0 0 ]
0001560 040 042 107 105 124 040 057 176 146 151 163 157 057 164 145 154
" G E T / ~ f i s o / t e l
0001600 145 143 157 156 057 124 150 162 157 156 163 157 156 055 113 165
e c o n / T h r o n s o n - K u
0001620 164 164 145 162 137 061 062 055 061 065 055 061 060 057 040 110
t t e r _ 1 2 - 1 5 - 1 0 / H
0001640 124 124 120 057 061 056 061 042 040 062 060 060 040 071 067 061
T T P / 1 . 1 " 2 0 0 9 7 1
0001660 012
\n
0001661
That's for an "exclude" file that sends grep -vf into never-never land, but one that works with grep -Fvf.
Also, FWIW, I wrote those "exclude" files both with the Mac text editor *and* with vi. No difference. So I can't see why there might be hidden characters.
---------- Post updated at 05:54 PM ---------- Previous update was at 05:44 PM ----------
Quote:
Originally Posted by alister
With -F, exclude contains literal strings. Without it, regular expressions.
I have to assume that was what was special about the alphabetic (as opposed to numeric) characters in the "exclude" file. With grep -vf, grep wanted to interpret those alphabetic character strings as expressions. The numbers were obviously not expressions, so it didn't get confused with those.
That's what seemed so screwy. grep -vf could exclude numbers successfully, but not alphabetic strings. Now, in my book, both are alphanumerics, so if one worked, the other should as well.
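A small sketch of what "literal strings" versus "regular expressions" means in practice. The sample lines here are invented for the demonstration. A pattern made only of letters or digits matches the same text either way; the two modes only part ways when a pattern contains a metacharacter such as ".":

```shell
dir=$(mktemp -d)

# Hypothetical sample lines, written to a scratch file.
printf '%s\n' 'status 404' 'status 414' 'version 4.4' 'user dan' > "$dir/sample.txt"

# As a basic regular expression, "." matches any single character,
# so 4.4 matches 404, 414 and 4.4.
regex_hits=$(grep -c '4.4' "$dir/sample.txt")

# With -F, only the literal three characters 4.4 match.
fixed_hits=$(grep -c -F '4.4' "$dir/sample.txt")

# A purely alphabetic pattern has no metacharacters, so it matches
# the same lines in both modes.
regex_dan=$(grep -c 'dan' "$dir/sample.txt")
fixed_dan=$(grep -c -F 'dan' "$dir/sample.txt")

echo "4.4 as regex: $regex_hits line(s), as fixed string: $fixed_hits line(s)"
echo "dan as regex: $regex_dan line(s), as fixed string: $fixed_dan line(s)"

rm -r "$dir"
```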
Quote:
od -bc exclude
0000000 062 060 066 012 063 060 064 012 064 060 063 012 064 060 064 012
2 0 6 \n 3 0 4 \n 4 0 3 \n 4 0 4 \n
0000020 064 060 065 012 065 060 060 012 151 156 166 151 164 145 012 155
4 0 5 \n 5 0 0 \n i n v i t e \n m
0000040 160 063 012 144 141 156 012 163 157 146 151 141
p 3 \n d a n \n s o f i a
0000054
... ... ...
That's for an "exclude" file that sends grep -vf into never-never land, but one that works with grep -Fvf.
Also, FWIW, I wrote those "exclude" files both with the Mac text editor *and* with vi. No difference. So I can't see why there might be hidden characters.
Wrong. There is a HUGE difference.
You didn't write the above exclude file with vi. It is not a text file. (It doesn't have a <newline> character at the end of the last line.) The behavior of grep is undefined when the input files you give it are not text files.
If you add the missing <newline> character to that file, I would be very surprised if grep -vf doesn't start working as you expected. There are several ways to fix that file. Among them are:
Code:
echo >> exclude
but don't do that if there is a <newline> at the end of the file or it will add an empty line to the end of the file; and
Code:
vi exclude
which will probably include a note in the status line when you open it something like:
Code:
"exclude" [noeol] 10L, 44C
where the noeol means that there is no end-of-line marker at the end of the last line. Using the vi command :wq will add the missing <newline> character to the buffer, rewrite the file with the missing <newline>, and quit. You can repeat this as many times as you want and it won't add empty lines to the end of the file.
Note that the standards don't require vi to work when given a file that is not a text file either; but the vi (or vim) on OS X will do what you need in this case.
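If you would rather not remember whether the <newline> is already there, something along these lines should make the echo approach safe to repeat. This is only a sketch: the function names ends_with_newline and fix_eol are made up for illustration, and a file whose last byte is a null byte may fool the check.

```shell
# Succeeds when the file is empty or its last byte is a <newline>.
ends_with_newline() {
    # Command substitution strips a trailing <newline>, so the result is
    # empty when the last byte was one (or when the file was empty).
    [ -z "$(tail -c 1 "$1")" ]
}

# Append a <newline> only when it is missing, so repeated runs
# never add an empty line to the end of the file.
fix_eol() {
    ends_with_newline "$1" || echo >> "$1"
}

# Usage:
#   fix_eol exclude
```

Running fix_eol twice on the same file leaves it unchanged the second time, which is the same property the vi :wq approach has.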
"The behavior of grep is undefined when the input files you give it are not text files."
So, um, here's a dumb question -- how is it that a file produced by Mac "TextEdit" is not a "text file"? But indeed, if the filename thus produced doesn't have a .txt on the end, it doesn't seem to have a <newline> at the end. In fact, if I open such a file with vi, it says at the bottom "[noeol]", like you said it would. I save it with vi, and from then on, "[noeol]" isn't reported. vi inserts that <newline> when it saves it and makes it into a real live text file, I guess. I can also just change the file name from "exclude" to "exclude.txt", and the OS sticks a <newline> on, it seems. Wow.
So a real "text file" has to have a <newline> character at the end, and Mac TextEdit doesn't put it there, if you don't specify a .txt suffix. I never knew that. I naively thought that, well, text is text.
Now, having done that, grep -vf still doesn't work on that file, once it has a <newline> on it.
The Mac OS X TextEdit application processes several file formats that are text files and several file formats that are not text files. If the name of a file opened (or created) by TextEdit ends with ".txt", it will treat it as a text file; if it ends with ".rtf", it will treat it as a rich text file; if it ends with ".doc", it will handle some of the text formatting done by Microsoft Word (and note that most Microsoft Word files ARE NOT text files). If there is no filename extension on the file, the preferences you have set in TextEdit will determine how it treats that file.
If you have a file (say xxx) that is not a text file and you rename the file xxx.txt, that doesn't change the format or contents of the file. (Although TextEdit might try to turn it into a text file if you use it to edit that file after you rename it.) Most UNIX utilities that take a filename as an operand couldn't care less what the name of the file is. The filename extensions like .txt, .sh, .mp3, .rtf, et cetera provide a useful convention to help humans (and a few applications) make good guesses about what should be inside that file.
If you have turned exclude into a real text file and:
Code:
grep -vf exclude log.txt
still goes to never-never land, I would assume that log.txt (even though its name ends in .txt and it has a <newline> at the end of the file) is not a text file as defined by the standards. The most likely problems would be that one or more "lines" in log.txt are longer than LINE_MAX bytes (2048 on recent Mac OS X systems) or that the file contains one or more null bytes (i.e., bytes with all bits set to 0).
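Both suspicions can be checked roughly from the shell. The sketch below uses two made-up helper names, count_nul_bytes and long_lines. Keep in mind that awk is itself only specified for text files, so treat the long-line report as a hint and not as proof:

```shell
# Count null bytes by comparing the file size with the size
# after tr has deleted every null byte.
count_nul_bytes() {
    total=$(wc -c < "$1" | tr -d ' ')
    kept=$(tr -d '\000' < "$1" | wc -c | tr -d ' ')
    echo $((total - kept))
}

# Print the number of every line whose length in bytes, counting the
# trailing <newline>, exceeds the limit. The limit defaults to this
# system's LINE_MAX; an optional second argument overrides it.
long_lines() {
    limit=${2:-$(getconf LINE_MAX)}
    # LC_ALL=C is meant to make awk count bytes, not multibyte characters.
    LC_ALL=C awk -v max="$limit" 'length($0) + 1 > max { print NR }' "$1"
}

# Usage:
#   count_nul_bytes log.txt
#   long_lines log.txt
```

A result of 0 from count_nul_bytes and no output from long_lines would suggest looking elsewhere for the problem.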
Yes, that is, of course, very true about different kinds of text files. But it is fascinating that a <newline> character, which isn't displayed in even vi, determines whether grep thinks of the file as truly text. That octal dump command is handy in that regard, as I suppose is the status line in vi.
And yes, that's right about just changing the suffix. I thought that worked, but it really doesn't. My mistake.
That's interesting speculation about why a file that looks like a real text file isn't doing the job with grep -vf. But at least with the F option, it's all fine and I can make it work. So I'll think about that and, in the meantime, let me thank everyone here for their very prompt and helpful comments.