Finding all files based on pattern


 
# 1  
Old 12-19-2014

Hi All,

I need to find all files in a directory that contain a specific pattern. A file should not be reported if the pattern appears only in commented text.

Everything between /* and */ is a comment.
On any line, -- starts a comment that runs to the end of that line, whether the -- is at the start of the line or in the middle of it.

For example, I need to find all files that contain insurance_no.

file1: this file should qualify for the search
Code:
where insurance_no=TGT.insurance_no 
-- insurance_no is unique no.

select * from 
table t1,t2
where t1.name=t2.name
and t.asset>2000 --and insurance_no <> "2521"

/* based on insuranace_no cutomer full
details can be find out*/

file2: this file should not qualify, because insurance_no appears only in comments

Code:
-- insurance_no is unique no.

select * from 
table t1,t2
where t1.name=t2.name
and t.asset>2000 --and insurance_no <> "2521"

/* based on insuranace_no cutomer full
details can be find out*/

The commands I have tried so far are below, but they do not work: they also report files (file2 in this case) that contain insurance_no only in comments.

Code:
find . -name "*.*" -exec grep -l "insurance_no" {} \; 2>/dev/null

find . -name "*.*" |xargs -n1 -I {} sh -c 'grep insurance_no {}|grep -v ".*--.*insurance_no.*"|grep -v ".*/\*.*insurance_no.*\*/" '

Thanks in advance for all your guidance /help.
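
Editorial note: one way to make the comment rules in this post concrete is to strip the comments first and search what is left. The following is a minimal sketch, not a solution posted in the thread. The function name strip_sql_comments is made up for illustration, and it assumes that /*, */ and -- never occur inside string literals.

```shell
# strip_sql_comments: hypothetical helper, not from the thread.
# Reads SQL from stdin (or from file arguments) and prints it with
# /* ... */ blocks and "-- to end of line" comments removed.
# Assumption: comment markers never appear inside string literals.
strip_sql_comments() {
    awk '
    FNR == 1 { in_block = 0 }       # never carry an open block into the next file
    {
        line = $0
        out = ""
        while (line != "") {
            if (in_block) {
                stop = index(line, "*/")
                if (stop == 0) break            # rest of the line is still inside the block
                line = substr(line, stop + 2)
                in_block = 0
            } else {
                blk = index(line, "/*")
                dash = index(line, "--")
                if (dash > 0 && (blk == 0 || dash < blk)) {
                    out = out substr(line, 1, dash - 1)
                    break                       # "--" comments out the rest of the line
                } else if (blk > 0) {
                    # keep the text before the block; the space stops two tokens from fusing
                    out = out substr(line, 1, blk - 1) " "
                    line = substr(line, blk + 2)
                    in_block = 1
                } else {
                    out = out line
                    break
                }
            }
        }
        print out
    }' "$@"
}

# Demo: the first insurance_no is real code, the second one is commented out,
# so the line is still printed, but without either comment.
printf '%s\n' 'select a, insurance_no /* key */ from t -- insurance_no again' |
    strip_sql_comments | grep -F -e insurance_no
```

Because the filter tracks an open /* across lines, it also handles block comments that span several lines, which a per-line grep -v cannot do.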
# 2  
Old 12-19-2014
Hello Lakshman Gupta,

We can search for the string "where insurance_no": in your example only file1 contains it, not file2.
The following may help you find such files. Please let me know if you have any questions.
Code:
find -type f -exec grep "where insurance_no" {} \; -print 2>/dev/null

Output will be as follows.
Code:
where insurance_no=TGT.insurance_no
./search_file1

Thanks,
R. Singh
# 3  
Old 12-19-2014
Hi Ravi,

Thanks for your time. The "where" can vary and may be replaced by other characters, and we have to scan through thousands of files, so this will not be a generalised solution.

Last edited by Lakshman_Gupta; 12-19-2014 at 02:46 AM.. Reason: it was not clear
# 4  
Old 12-19-2014
Hello Lakshman_Gupta,

Could you please try the following and let us know if it helps?
Code:
find -type f -exec grep '[^a-zA-Z0-9]insurance_no=[^a-zA-Z0-9]*'  {} \; -print 2>/dev/null

Output is as follows.
Code:
where insurance_no=TGT.insurance_no
./search_file1

EDIT: I also created the following file, and the above command works for it too.
Code:
cat ./search_file3
908where insurance_no=90TGT.insurance_no
-- insurance_no is unique no.
select * from
table t1,t2
where t1.name=t2.name
and t.asset>2000 --and insurance_no <> "2521"
/* based on insuranace_no cutomer full
details can be find out*/

After running the command we get the following results.
Code:
find -type f -exec grep '[^a-zA-Z0-9]insurance_no=[a-zA-Z0-9]*'  {} \; -print 2>/dev/null
908where insurance_no=90TGT.insurance_no
./search_file3
where insurance_no=TGT.insurance_no

Thanks,
R. Singh

Last edited by RavinderSingh13; 12-19-2014 at 03:34 AM.. Reason: Added a note to solution
# 5  
Old 12-19-2014
You haven't said much about your definition of "pattern".

Are you performing case sensitive matches?

Can the pattern match any text, or does the pattern have to match entire "words"? If you're limiting it to words, what defines a word boundary?

Will your patterns ever contain any characters that are special in a BRE or ERE?

Will your patterns ever contain any characters that are special in a filename matching pattern?

Will your patterns ever contain any whitespace characters? (If so, does the pattern need to be matched if the pattern extends across line boundaries?)

Do you just need to process all of the regular files in a single directory? Or do you need to process all of the regular files in a file hierarchy rooted in a directory?

Do you just want the names of files that contain the (uncommented) pattern for which you're searching? Or, do you want the filename and the lines that contain the pattern? If you want the lines containing the pattern; do you want entire lines or can it just be lines with the comments discarded?
# 6  
Old 12-19-2014
Centos 6 / bash
This seems to work also.
Code:
find -type f -exec egrep -l '^\w.*[^- ]insurance_no.*$' '{}' \; 
./file3
./file1

Edit:
corrected spelling... didn't change anything here.
...based on insuranace_no cutomer

Last edited by ongoto; 12-19-2014 at 08:03 AM.. Reason: corrected misspellings in sample files
# 7  
Old 12-19-2014
Thanks Don !!!

Here are my responses to your questions:

Are you performing case sensitive matches? Yes.

Can the pattern match any text, or does the pattern have to match entire "words"? If you're limiting it to words, what defines a word boundary? It can match any text; for example, it can appear as insurance_no, a.insurance_no=, <>insurance_no, b.insurance_no or ,insurance_no,

Will your patterns ever contain any characters that are special in a BRE or ERE? No.

Will your patterns ever contain any characters that are special in a filename matching pattern? No.

Will your patterns ever contain any whitespace characters? (If so, does the pattern need to be matched if the pattern extends across line boundaries?) No whitespace.

Do you just need to process all of the regular files in a single directory? Or do you need to process all of the regular files in a file hierarchy rooted in a directory? A file hierarchy rooted in a directory.

Do you just want the names of files that contain the (uncommented) pattern for which you're searching? Or, do you want the filename and the lines that contain the pattern? If you want the lines containing the pattern; do you want entire lines or can it just be lines with the comments discarded? I only need the names of the files that contain the uncommented pattern.


Ravi,
I am testing your solution against more test cases. Thanks for your time.

I just changed the content of test_2.txt as shown below. Its name should now be returned, but it is not.

Code:
-- insurance_no is unique no.

select *,insurance_no from
table t1,t2
where t1.name=t2.name
and t.asset>2000 --and insurance_no <> "2521"

/* based on insuranace_no cutomer full
details can be find out*/

Meanwhile, I was trying the following:

Code:
find . -name "*.*" |xargs  -n1 -I {} sh -c 'a=`grep insurance_no {}|grep -v ".*--.*insurance_no.*"|grep -v ".*/\*.*insurance_no.*\*/"`;if [ -z "$a" ] ;then echo "1">/dev/null ; else echo {} ;fi'

but it hits an xargs limit (error below). For shorter file names it works fine.

Code:
xargs: Maximum argument size with insertion via {}'s exceeded


Last edited by Lakshman_Gupta; 12-19-2014 at 07:14 AM.. Reason: to add more details
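
Editorial note: the xargs error above most likely comes from `-I {}`, which pastes each file name into the long `sh -c` string and limits the size of the resulting argument. A possible workaround, sketched below, is to let find pass the file names as positional parameters so they are never inserted into the script text. The function name list_matches is made up for illustration. The filters are the same line-based ones used in the post above, so they still cannot recognise a /* */ comment that spans several lines, and the pattern is assumed to contain no regex special characters.

```shell
# list_matches DIR PATTERN: hypothetical helper, not from the thread.
# Prints the name of every regular file under DIR that has at least one line
# containing PATTERN which survives the two line-based comment filters.
list_matches() {
    dir=$1
    pattern=$2
    find "$dir" -type f -exec sh -c '
        pattern=$1
        shift
        # The remaining arguments are file names supplied by find; they are
        # data, not script text, so their length does not affect the script.
        for f in "$@"; do
            if grep -F -e "$pattern" "$f" |
               grep -v -e "--.*$pattern" |
               grep -v -e "/\*.*$pattern.*\*/" |
               grep -q .; then
                printf "%s\n" "$f"
            fi
        done
    ' sh "$pattern" {} +
}
```

It would be called as `list_matches . insurance_no`. Using `{} +` also starts far fewer shells than `xargs -n1`, because find hands each shell a batch of file names.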