find common elements in 2 files (for loop)
Posted by ibking on Sunday 28th of October 2007 09:17:15 PM (Post 302142629, Shell Programming and Scripting)

Hi,
I'm new here (and to scripting too).

I was hoping for some help in comparing two files.

I have a file called 'file1' with a list of names in the following format:
adam jones
paul higgins
kelly lowe


I also have another file, called 'file2', which may contain some of the names along with a lot of other stuff. It contains info such as:
aaa aaaa paul higgins aaaaa
aaaaa aaaaa aaaa aa aa aaa
aaaa aaaa kelly lowe aaaaa
asdsaad adsadsa asdas
assadasd
sa etc

(where aaaa is just random HTML stuff)

I want to find the names common to both files and produce output in the form of:
0
1
1


This is with reference to the 'master' file (file1), i.e. 'file2' contains no adam jones, and 1 each of paul higgins and kelly lowe.

To do this, I am using the following shell script, but it does not seem to work:

#!/bin/sh

for i in file1
do
    grep -c $i file2
done


It seems like quite a simple script, but I'm not sure why it does not work. (It only seems to grep for the first name, but does not loop through.)

Thanks for any help.
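
A likely cause: `for i in file1` loops over the literal word `file1`, not over the lines inside that file, so grep runs exactly once and searches file2 for the string "file1". Even with the file's contents substituted in, an unquoted `$i` would split "paul higgins" into two words. Below is a minimal sketch of one way around both problems, assuming file1 holds one name per line. The function name `count_names` is made up for illustration and is not part of the original script.

```shell
#!/bin/sh
# Sketch: for each name listed in one file, print how many lines of
# another file contain that name. count_names is a hypothetical helper.
count_names() {
    # $1 = file with one name per line, $2 = file to search
    while IFS= read -r name; do
        # Skip blank lines: an empty pattern would match every line.
        [ -n "$name" ] || continue
        # -F: match the name as a fixed string, not a regex
        # -c: print the number of matching lines
        # -e: keeps a name starting with '-' from being read as an option
        # grep exits with status 1 when the count is 0; that is not an error here.
        grep -F -c -e "$name" "$2" || [ "$?" -eq 1 ]
    done < "$1"
}

# Only run when both files from the post exist in the current directory.
if [ -f file1 ] && [ -f file2 ]; then
    count_names file1 file2
fi
```

Note that `grep -c` counts matching lines rather than individual occurrences, so a name that appears twice on the same line of file2 is still counted once.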
 
