How to check "item" is in multiple files?


 
# 1  
Old 01-17-2013

I have multiple files:
Code:
file1:
one
two
three

file2:
two
four
six

file3:
one
two
ten

These files are all in a directory called dir1.

I also have multiple other files:
Code:
x_file1:
one
six
ten

x_file2:
three
eight
nine

x_file3:
seven
one
five

These files are all in a directory called dir2.

I want to run through each of the first set of files in dir1, and check if any of those items are in any of the other set of files. So in this case I would want my output to be:

Code:
output:
two
four

# 2  
Old 01-17-2013
Code:
cd dir1
awk ' {arr[$0]++; next} END {for(i in arr){print i}} ' file* > /tmp/summed.txt

cd another_dir
awk ' NR==FNR {arr[$0]++; next} {arr[$[0]++; next} 
         END{for(i in arr){if(arr[i]>1{print i}}}'  /tmp/summed.txt X_file* > output.txt

# 3  
Old 01-17-2013
I don't understand what you want. The item "one" appears in two files in dir1 and in two files in dir2, but it is not in your wanted output. The items "three" and "six" each appear in a file in dir1 and a file in dir2, but neither of them is in your wanted output. The item "four" appears in only one file in dir1 and doesn't appear in any file in dir2, but it is in your wanted output. All of these cases seem to contradict what you said you want. Please try again to explain more clearly what you are trying to do.
# 4  
Old 01-18-2013
I have not tried the code given by Jim McNamara yet, but to clarify for Don: if ANY "item" from file1, file2, or file3 appears in ANY of the files in dir2, then I do not care about that item. If there is ANY "item" which occurs in dir1 and does NOT occur in ANY of the files in dir2, that is an item I want in the output, regardless of how many times it appears in dir1. I hope that makes sense now.
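
What is described here amounts to a set difference: the unique items found anywhere in dir1, minus the unique items found anywhere in dir2. As a minimal sketch of that idea (not the approach used elsewhere in this thread), the standard `sort -u` and `comm -23` utilities can compute it. The scratch directory, the `*.items` file names, and the `result` variable are assumptions made for illustration; the sample data is copied from post #1.

```shell
# Assumption: recreate the sample data from post #1 in a scratch directory.
export LC_ALL=C            # make sort and comm agree on collation order
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"
printf '%s\n' one two three    > "$work/dir1/file1"
printf '%s\n' two four six     > "$work/dir1/file2"
printf '%s\n' one two ten      > "$work/dir1/file3"
printf '%s\n' one six ten      > "$work/dir2/x_file1"
printf '%s\n' three eight nine > "$work/dir2/x_file2"
printf '%s\n' seven one five   > "$work/dir2/x_file3"

# Collect the unique items of each directory, sorted as comm requires.
sort -u "$work"/dir1/* > "$work/dir1.items"
sort -u "$work"/dir2/* > "$work/dir2.items"

# -23 suppresses columns 2 and 3, leaving lines found only in the first file.
result=$(comm -23 "$work/dir1.items" "$work/dir2.items")
printf '%s\n' "$result"    # prints "four" then "two"
```

Because `comm` works on sorted input, the items come out in sorted order rather than in the order shown in post #1.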
# 5  
Old 01-18-2013
Try this:
Code:
awk '
d==0 {  a1[$0]
        next
}
{       a2[$0]
}
END {   for(i in a1)
                if(!(i in a2))
                        print i
}' dir1/* d=2 dir2/*

As always, if you're running on a Solaris/Sun OS system, use /usr/xpg4/bin/awk or nawk instead of awk.
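
One detail worth knowing about the script in this post: `for (i in a1)` visits array elements in an unspecified order, so the order of the output lines can differ between awk implementations. A minimal sketch that wraps the same awk program in a shell function and pipes the result through `sort` for a stable order follows. The function name `only_in_first` and its two directory arguments are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: print items found in directory $1 but in no file of $2.
only_in_first() {
    awk '
    d == 0 { a1[$0]; next }        # still reading files from the first directory
           { a2[$0] }              # d was set to 2: reading the second directory
    END {
        for (i in a1)              # iteration order is unspecified in awk
            if (!(i in a2))
                print i
    }' "$1"/* d=2 "$2"/* | sort    # sort makes the output order deterministic
}
```

With the sample data from post #1, `only_in_first dir1 dir2` should print `four` and `two`. The `d=2` operand is an awk command-line variable assignment, which takes effect only when awk reaches it in the argument list; that is how the program knows the dir1 files have all been read.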

PS: I believe that when you said
Quote:
check if any of those items are in any of the other set of files
in your original message in this thread, you meant
Quote:
check if any of those items are NOT in any of the other set of files
This User Gave Thanks to Don Cragun For This Post:
# 6  
Old 01-18-2013
Jim, I got the following error message (and yes, I do have a directory named dir2):

Code:
./junk: line 4: cd: dir2: No such file or directory
awk:  NR==FNR {arr[$0]++; next} {arr[$[0]++; next}
awk:                                  ^ syntax error
awk: fatal: invalid subscript expression

Don, that worked perfectly, thanks!
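
A note on the error output in this post. The awk errors trace back to the code in post #2: `$[0]` is not valid awk (presumably `$0` was intended), and `if(arr[i]>1{print i}` is missing a closing parenthesis. The `cd` failure is most likely because the script had already changed into dir1, so the relative name dir2 was looked up inside dir1. The pattern `X_file*` would also fail to match `x_file1` on a case-sensitive filesystem. What follows is a sketch of what that approach appears to have been aiming for, under those assumptions; the function name `in_both` is hypothetical, and directories are passed as arguments so no `cd` is needed.

```shell
# Hypothetical helper: print items that occur in some file of $1 AND some file of $2.
in_both() {
    summed=$(mktemp) || return 1
    # Pass 1: write each distinct line from the first directory once.
    awk '!seen[$0]++' "$1"/* > "$summed"
    # Pass 2: NR == FNR is true only while awk reads its first file ($summed).
    # Caveat: if $summed is empty, that test would match the next file instead.
    awk 'NR == FNR { arr[$0]; next }
         ($0 in arr) && !printed[$0]++' "$summed" "$2"/* | sort
    rm -f "$summed"
}
```

Note that this yields the items common to both directories (`one`, `six`, `ten`, `three` for the sample data), which matches the literal wording of post #1 but not the clarified requirement in post #4; the script in post #5 is the one that produces `two` and `four`.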
