How to form a correct syntax to sift out according to complementary patterns with 'find'?


 
# 8  
Old 01-18-2018
Hi,

Taking these in turn:

1. grep -l prints only the names of the files that contain a match for the pattern you're searching for, rather than printing the matching lines themselves. For example, compare:

Code:
$ cat test.txt
This is a test file.
This is the only line that contains the string 'FOO'.
This line doesn't contain it, and is the last line in the file.
$ grep FOO test.txt
This is the only line that contains the string 'FOO'.
$ grep -l FOO test.txt
test.txt
$

So we can see that when we used the -l flag, we just got the filename returned, rather than the matching line within the file.

2. grep -L prints out only the names of those files which do not match the given string. Again, best demonstrated with an example.

Code:
$ cat test.txt
This is a test file.
This is the only line that contains the string 'FOO'.
But because of that, a 'grep -L FOO' won't return the name of this file.
$ cat test2.txt
This file does not contain the string we're searching for.
So its filename will be printed when we do a grep -L
$ grep -L FOO *.txt
test2.txt
$

So here, we only got test2.txt in our output and not test.txt, since test2 did not contain the string we were searching for, whereas test did. Because we wanted to only see the names of those files which did not contain our string, this makes sense.

3. I don't have time right now for a full write-up of how -prune behaves, but basically it tells find not to descend into any directory matched by the tests that come before the -prune flag. If this isn't clear then I'll try to reply again tonight with a bit more detail on this last point.

Hope this helps.

-

Right, a bit more detail on -prune. As others have mentioned throughout previous replies, in its usual usage find performs a series of tests to ultimately return whatever it is you've asked it to find. By default, all of these tests must pass, and only things which pass all of the tests you've specified will be acted upon by find in the end.

So in the case of the command find . -type f -exec grep -l FOO \{\} \;, there are a few things being tested here.

Firstly, that the thing being considered resides underneath the current working directory, represented by . (the first argument is always the root of the path that find will start from).

Second, we are only interested in things which are files. The -type flag can be used to search for directories, files, symbolic links, device files, all kinds of things. Regular files are represented by -type f.

Now before we go on, remember that in order to proceed to the next test, the previous test must have passed. So at this point we've found all files that reside in or somewhere beneath our current directory, and find moves on to the third step.

The third flag is a bit different, or might seem so at first. The purpose of -exec is to execute an external command on whatever it is we've found up 'til now. So here, we execute the command grep -l FOO on each file that resides in or beneath the current working directory. The item currently being processed is represented by a pair of curly brackets, and the end of the command is signified by a semi-colon. So the \{\} is substituted at execution time by whichever of our found-so-far things is currently up for consideration, and the \; signifies the termination of the command to be executed (the backslash stops the shell from interpreting the semi-colon itself).
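To make that concrete, here is a small sketch you can run in a scratch directory. The directory layout and file names are made up purely for the demonstration. It also shows the {} + form of -exec, which hands many file names to a single grep instead of starting one grep per file.

```shell
#!/bin/sh
# Build a throwaway tree; every name below is hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'this line has FOO in it\n' > "$demo/a.txt"
printf 'nothing to see here\n'     > "$demo/b.txt"
printf 'FOO again\n'               > "$demo/sub/c.txt"
cd "$demo" || exit 1

# One grep per file: {} becomes the current path, \; ends the command.
per_file=$(find . -type f -exec grep -l FOO {} \; | LC_ALL=C sort)

# Batched form: {} + passes as many paths as fit to a single grep.
batched=$(find . -type f -exec grep -l FOO {} + | LC_ALL=C sort)

printf 'per file:\n%s\n' "$per_file"
printf 'batched:\n%s\n' "$batched"
```

Both forms should list the same two files (a.txt and sub/c.txt); the batched one is usually faster on large trees.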

Now, that explains things so far. But what if we first want to exclude a category of things from consideration that would otherwise normally be caught by find? That's where -prune comes in. When the tests before it match a directory, -prune tells find not to descend into that directory, so nothing inside it is ever considered. -prune itself always counts as a passed test, which is why it is normally followed by -o ("or"): for the pruned directory the left-hand side has already succeeded, so whatever follows the -o is not run on it.
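A minimal sketch of that behaviour, using an invented tree with one directory called skip (the names are assumptions for the demo, nothing more):

```shell
#!/bin/sh
# Throwaway tree; "keep" and "skip" are hypothetical directory names.
demo=$(mktemp -d)
mkdir -p "$demo/keep" "$demo/skip/deep"
: > "$demo/keep/one.txt"
: > "$demo/skip/two.txt"
: > "$demo/skip/deep/three.txt"
cd "$demo" || exit 1

# Without -prune every regular file in the tree is reported.
all=$(find . -type f -print | LC_ALL=C sort)

# With -prune, find never descends into ./skip, so only files
# outside it reach the right-hand side of the -o.
pruned=$(find . -type d -name skip -prune -o -type f -print | LC_ALL=C sort)

printf 'all:\n%s\n' "$all"
printf 'pruned:\n%s\n' "$pruned"
```

Note the explicit -print on the right-hand side. Without it, find applies its default -print to the whole expression and the pruned directory itself would be listed too.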

So if we look again at my original proposed solution to your problem, the command find . -type d -name 'browser?' -prune -o -type f -exec grep -l FOO \{\} \; can be broken down as follows:

1. The path we are running our find beneath. This is the current working directory, .

2. -type d -name 'browser?', meaning "all directories whose names match the shell pattern 'browser?'", i.e. the word browser followed by exactly one more character. Note that -name takes a shell-style pattern (a glob), not a regular expression.

3. -prune -o, meaning "do not descend into whatever matched the tests before this point; for everything else, carry on with whatever comes next". So no content within directories in or underneath the current working directory whose names match 'browser?' will be affected by whatever follows this point.

4. -type f -exec grep -l FOO \{\} \;, meaning "execute grep -l FOO on all files". But because of our previous -prune -o, the end result of this is only to execute the grep -l FOO on all files that do not reside in or underneath a directory whose name matches the pattern 'browser?'.
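The breakdown above can be tried out on a small invented tree. The directory names browser1, browser and docs are assumptions for the demo; browser is included to show that 'browser?' needs one extra character and so does not match it.

```shell
#!/bin/sh
# Throwaway tree; all names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/browser1" "$demo/browser" "$demo/docs"
printf 'FOO\n' > "$demo/browser1/hit.txt"  # inside a pruned directory
printf 'FOO\n' > "$demo/browser/hit.txt"   # 'browser?' does not match plain "browser"
printf 'FOO\n' > "$demo/docs/hit.txt"
printf 'bar\n' > "$demo/docs/miss.txt"
cd "$demo" || exit 1

# Skip directories named browser + one character, grep everything else.
found=$(find . -type d -name 'browser?' -prune -o -type f -exec grep -l FOO {} \; | LC_ALL=C sort)

printf '%s\n' "$found"
```

Only browser/hit.txt and docs/hit.txt should come back. The file under browser1 is never looked at, because find does not enter that directory at all.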

I hope this helps clear things up. If you have any more questions let me know and I'll be happy to help if I can.

Last edited by drysdalk; 01-18-2018 at 04:30 PM..
# 9  
Old 01-21-2018
Apple

Huge thanks, your explanatory skills are impressive indeed. I only came back online now, after several days of intensive work trying to get through this on my own, and was astonished to find that we share a similar understanding.
I think I may have nailed the essence of why I was unable to filter out the unneeded entries. After numerous failed tests, in which I searched just for the entries I wanted skipped, I stumbled upon the fact that I had missed the logic when using
Code:
'.*(Safari|[Oo]pera).*'

in that it output no results when I tried to feed it as an argument to
Code:
-and -not -path

parameter in combination with
Code:
'.*(keyword|KEYWORD).*'

. I had overlooked something obvious: 1) expressions of the type
Code:
'.*(Safari|[Oo]pera).*'

can NOT be arguments to
Code:
-path

,
Code:
-name

and 2) they cannot be combined with any parameter other than
Code:
-regex

because, as the parameter's name implies, the pattern is a regular expression of the extended set, so it could not do what I was after unless used in conjunction with -regex.
So the correct logic of this part of the command line was
Code:
-regex '.*(keyword|KEYWORD).*' -and -not -regex '.*(Safari|[Oo]pera).*'

. Mixing -path and -regex on the same line with extended regexes was like comparing the incomparable. On further investigation I found that I could not make much use of
Code:
-exec grep

or of piping into either
Code:
grep

or
Code:
xargs

, because grep is mostly useful for matching strings inside target files or in the output of commands such as
Code:
ls

so I dropped that option.
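For anyone wanting to try the -regex ... -and -not -regex ... combination above without scanning the whole disk, here is a rough sketch on an invented tree (all file and directory names are assumptions). One caveat: -E is the BSD/macOS way of asking for extended regular expressions; GNU find uses -regextype posix-extended instead, so the sketch checks which one is available.

```shell
#!/bin/sh
# Throwaway tree; all names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/Safari" "$demo/opera" "$demo/other"
: > "$demo/Safari/keyword.plist"
: > "$demo/opera/KEYWORD.db"
: > "$demo/other/keyword.txt"
: > "$demo/other/unrelated.txt"
cd "$demo" || exit 1

# -regex is matched against the whole path (e.g. ./other/keyword.txt),
# which is why both patterns start and end with .*
if find -E . -maxdepth 0 >/dev/null 2>&1; then
    # BSD/macOS find: -E switches -regex to extended regular expressions.
    result=$(find -E . -regex '.*(keyword|KEYWORD).*' -and -not -regex '.*(Safari|[Oo]pera).*')
else
    # GNU find: same idea, spelled -regextype posix-extended.
    result=$(find . -regextype posix-extended -regex '.*(keyword|KEYWORD).*' -and -not -regex '.*(Safari|[Oo]pera).*')
fi

printf '%s\n' "$result"
```

Of the three paths containing keyword or KEYWORD, the two under Safari and opera are filtered out by the second -regex, leaving only other/keyword.txt.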

I looked closely at the description of
Code:
-prune

once more, and it was this phrase that caught my attention and brought enlightenment:
Quote:
It causes find to not descend into the current file
.

So this is what I needed: to omit everything below each match and get only the highest-level path for every result matching the pattern. It meant I had to append this option to the end of the command line, with no further constructs after it. Having tested it on simpler cases, I glanced through the man page and threw in
Code:
-x

to get rid of the constant "/dev/fd 3: not a directory" lines.
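The effect of a trailing -prune can be seen on a small invented tree where the match is nested several levels deep (the keyword.app layout is an assumption for the demo):

```shell
#!/bin/sh
# Throwaway tree; all names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/apps/keyword.app/Contents/keyword-data"
: > "$demo/apps/keyword.app/Contents/keyword-data/keyword.log"
cd "$demo" || exit 1

# No -prune: every matching path at every depth is printed.
everything=$(find . -name '*keyword*' | LC_ALL=C sort)

# Trailing -prune: once a directory matches, find prints it (the default
# -print still applies) but does not descend into it.
top_only=$(find . -name '*keyword*' -prune | LC_ALL=C sort)

printf 'everything:\n%s\n' "$everything"
printf 'top only:\n%s\n' "$top_only"
```

The first run lists three paths, the second only ./apps/keyword.app, which is the "highest level" result described above.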

So, to sum up, the entire line that I struggled to come up with had to be:

Code:
echo PASSWORD | sudo -S find -E -x / -regex '.*(keyword|KEYWORD).*' -and -not -regex '.*(Safari|[Oo]pera).*' -and -not -path '*OtherAlwaysShowingUpUselessLine*' -prune

Notice that the argument to -path is NOT a regular expression of the extended set; it is a shell-style pattern, and it should be quoted so that the shell does not expand it first. The -E option to
Code:
find

is only needed because
Code:
-regex

is used too. In this case it conforms to the logic nicely and assists in the required manner.
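A quick way to see the difference between the two kinds of pattern, again on an invented tree (the names cache, noise and data are assumptions): a glob such as '*noise*' works with -path, while the regex-style spelling '.*noise.*' is read by -path as a glob with literal dots and matches nothing here.

```shell
#!/bin/sh
# Throwaway tree; all names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/cache/noise" "$demo/data"
: > "$demo/cache/noise/item"
: > "$demo/data/item"
cd "$demo" || exit 1

# -path takes a glob matched against the whole path; * also matches "/".
kept=$(find . -type f -not -path '*noise*')

# Read as a glob, '.*noise.*' means: a literal dot, anything, "noise",
# another literal dot, anything. ./cache/noise/item has no dot after
# "noise", so nothing matches.
regex_style=$(find . -type f -path '.*noise.*')

printf 'kept: %s\n' "$kept"
printf 'regex-style pattern matched: [%s]\n' "$regex_style"
```

The first command should leave only ./data/item; the second should come back empty, which is the kind of silent "no results" that makes mixing the two styles so confusing.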

That way I was able to reduce the output to only 5 lines, "sandboxing" that app in the search results, to which I then had the opportunity to apply further actions.

Last edited by scrutinizerix; 01-24-2018 at 09:14 PM..