Compare multiple files and print unique lines


 
# 1  
Old 02-03-2012

Hi friends,

I have multiple files. For now, let's say I have two files in the following format:

cat 1.txt
Quote:
A B C 1 2 3 4 5 6 7 8
D E F 9 8 7 6 5 4 3 2
G H I 0 1 2 3 4 5 6 7
cat 2.txt
Quote:
A B C -1 -2 3 4 5 6 -7 80
D E F 90 88 76 54 32 1 0 1
M N O 99 65 34 22 13 9 4 3

output.txt
Quote:
G H I 0 1 2 3 4 5 6 7 1.txt
M N O 99 65 34 22 13 9 4 3 2.txt
Please note that my files are not sorted, and in the output file I need an extra column naming the file each line comes from. I have more than 100 files to process.

Any help is highly appreciated.

Thanks
# 2  
Old 02-03-2012
Code:
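awk 'END {
  for (R in r) {
    split(r[R], t, SUBSEP)
    # t[1] is 0 only if the key was seen exactly once across all files
    if (!t[1])
      print t[3], t[2]
    }
  }
{
  # key: the first three columns
  k = $1 SUBSEP $2 SUBSEP $3
  # keep the previous count, the file name and the whole line for this key
  r[k] = c[k]++ SUBSEP FILENAME SUBSEP $0
  }' [12].txt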
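To try the command above, the two sample files from post #1 can be recreated in a scratch directory. This is only a sketch: the temporary directory and the trailing `sort` are additions, not part of the original answer. The `sort` is there because awk's `for (R in r)` visits keys in no guaranteed order.

```shell
#!/bin/sh
# Scratch directory for the demo (an assumption, not from the thread)
dir=$(mktemp -d) && cd "$dir" || exit 1

cat > 1.txt <<'EOF'
A B C 1 2 3 4 5 6 7 8
D E F 9 8 7 6 5 4 3 2
G H I 0 1 2 3 4 5 6 7
EOF

cat > 2.txt <<'EOF'
A B C -1 -2 3 4 5 6 -7 80
D E F 90 88 76 54 32 1 0 1
M N O 99 65 34 22 13 9 4 3
EOF

# Same program as above; sort only makes the output order repeatable
out=$(awk 'END {
  for (R in r) {
    split(r[R], t, SUBSEP)
    if (!t[1])
      print t[3], t[2]
    }
  }
{
  k = $1 SUBSEP $2 SUBSEP $3
  r[k] = c[k]++ SUBSEP FILENAME SUBSEP $0
  }' [12].txt | sort)

printf '%s\n' "$out"
```

With these inputs it prints the `G H I` line tagged `1.txt` and the `M N O` line tagged `2.txt`. Note that a key repeated inside a single file is also treated as a duplicate by this program.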

# 3  
Old 02-03-2012
Quote:
Originally Posted by radoulov
Code:
awk 'END {
  for (R in r) {
    split(r[R], t, SUBSEP)
    if (!t[1])
      print t[3], t[2]
    }
  }
{
  k = $1 SUBSEP $2 SUBSEP $3
  r[k] = c[k]++ SUBSEP FILENAME SUBSEP $0 
  }' [12].txt


It works great, but will it compare the first three columns?
# 4  
Old 02-03-2012
Yes.
Do you want to compare the absolute numeric values of the rest of the columns?
# 5  
Old 02-03-2012
Quote:
Originally Posted by radoulov
Yes.
Do you want to compare the absolute numeric values of the rest of the columns?
Yes.

Also, the first three columns might contain numbers instead of letters.
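Numbers in the key columns need no special handling in the command from post #2, because awk array subscripts are always strings and the key is built by string concatenation. The flip side, sketched here with made-up input, is that `1` and `1.0` count as different keys.

```shell
# Made-up input: columns 2 and 3 match on every line, column 1 is 1, 1.0, 1
out=$(printf '%s\n' '1 2 3 x' '1.0 2 3 y' '1 2 3 z' |
  awk '{ c[$1 SUBSEP $2 SUBSEP $3]++ }   # same key as in post #2
       END {
         for (k in c) {
           split(k, p, SUBSEP)           # unpack the key for printing
           print p[1], p[2], p[3], c[k]
         }
       }' | LC_ALL=C sort)
printf '%s\n' "$out"
```

The two `1` lines share a key (count 2), while `1.0` gets a key of its own (count 1). If such values should be treated as equal, the key columns would have to be normalized first, for example with `$1 + 0`.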
# 6  
Old 02-03-2012
Please post bigger samples of the input files and, again, an example of the desired output based on that exact input.

For example, I don't understand why these two lines shouldn't be considered unique ...:

Code:
D E F 9 8 7 6 5 4 3 2 
D E F 90 88 76 54 32 1 0 1

# 7  
Old 02-03-2012
Quote:
Originally Posted by radoulov
Please post bigger samples of the input files and, again, an example of the desired output based on that exact input.

For example, I don't understand why these two lines shouldn't be considered unique ...:

Code:
D E F 9 8 7 6 5 4 3 2 
D E F 90 88 76 54 32 1 0 1

I should have asked whether you needed more data, sorry. Here it is.

1.txt
Code:
A 2 3 1 2 3 4 5 6 7 8
D 4 5 9 8 7 6 5 4 3 2 
G 5 6 0 1 2 3 4 5 6 7
K 7 8 1 32 33 45 67 98 76 34
I 7 8 I A M A N I N D
L 2 3 G O T O H E L L

2.txt
Code:
A 2 3 1 2 3 4 5 6 7 8
D 4 5 9 8 7 6 5 4 3 2 
G 5 6 0 1 2 3 4 5 6 7
D O L K I N H J K I L
J G H J L K M N J U I 
M A A T U J H E S A L

3.txt
Code:
A 2 3 1 2 3 4 5 6 7 8
D 4 5 9 8 7 6 5 4 3 2 
G 5 6 0 1 2 3 4 5 6 7

4.txt
Code:
A 2 3 -1 -2 3 4 5 6 -7 80
D 4 5 90 88 76 54 32 1 0 1
M N O 99 65 34 22 13 9 4 3

Output.txt
Code:
M N O 99 65 34 22 13 9 4 3 4.txt
K 7 8 1 32 33 45 67 98 76 34 1.txt
I 7 8 I A M A N I N D 1.txt
L 2 3 G O T O H E L L 1.txt
D O L K I N H J K I L 2.txt
J G H J L K M N J U I 2.txt
M A A T U J H E S A L 2.txt

I just need a match on the first three columns: if the same three values are present in more than one file, the line should be eliminated. Thanks for all your help.

Moderator's Comments:
Please use code tags!
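The requirement as finally stated (drop every line whose first three columns occur in more than one file) can be sketched as follows. This is one possible approach, not a reply from the thread: it counts distinct files per key, so a key repeated inside a single file is still kept, and it holds all lines in memory. The scratch directory and the array names `seen`, `nfiles`, `line`, `key` and `src` are choices made for this example.

```shell
#!/bin/sh
# Scratch directory for the demo (an assumption, not from the thread)
dir=$(mktemp -d) && cd "$dir" || exit 1

cat > 1.txt <<'EOF'
A 2 3 1 2 3 4 5 6 7 8
D 4 5 9 8 7 6 5 4 3 2
G 5 6 0 1 2 3 4 5 6 7
K 7 8 1 32 33 45 67 98 76 34
I 7 8 I A M A N I N D
L 2 3 G O T O H E L L
EOF

cat > 2.txt <<'EOF'
A 2 3 1 2 3 4 5 6 7 8
D 4 5 9 8 7 6 5 4 3 2
G 5 6 0 1 2 3 4 5 6 7
D O L K I N H J K I L
J G H J L K M N J U I
M A A T U J H E S A L
EOF

cat > 3.txt <<'EOF'
A 2 3 1 2 3 4 5 6 7 8
D 4 5 9 8 7 6 5 4 3 2
G 5 6 0 1 2 3 4 5 6 7
EOF

cat > 4.txt <<'EOF'
A 2 3 -1 -2 3 4 5 6 -7 80
D 4 5 90 88 76 54 32 1 0 1
M N O 99 65 34 22 13 9 4 3
EOF

out=$(awk '
{
  sub(/[ \t]+$/, "")             # drop trailing blanks before appending the file name
  k = $1 SUBSEP $2 SUBSEP $3     # key: the first three columns
  if (!seen[k, FILENAME]++)      # first time this key shows up in this file
    nfiles[k]++                  # number of distinct files containing the key
  line[NR] = $0; key[NR] = k; src[NR] = FILENAME
}
END {
  for (i = 1; i <= NR; i++)      # replay the lines in input order
    if (nfiles[key[i]] == 1)
      print line[i], src[i]
}' [1-4].txt)

printf '%s\n' "$out"
```

For the four sample files this prints the same seven lines as the requested Output.txt, but in input order (the `M N O` line from 4.txt comes last). For the full set of 100+ files the glob could be widened to something like `*.txt`, provided the output file is written elsewhere so that it is not read as input.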