Comparing similar columns in two different files


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Comparing similar columns in two different files
# 1  
Old 04-11-2008
Network Comparing similar columns in two different files

Hi,

I have two text files.The first and the 2nd file have data in the same format
For e.g. The first file has
table_name1 column1 sum(column1) max(column1) min(column1)
table_name1 column2 sum(column2) max(column2) min(column2)
table_name1 coulmn3 sum(column3) max(column3) min(column3)

table_name2 column1 sum(column1) max(column1) min(column1)
table_name2 column2 sum(column2) max(column2) min(column2)
.
.

The 2nd file has the same data but few additional columns might be present in it.

I need to compare the values of both the files for common columns and report the mismatches to a separate file.
I am new to unix. Can anyone tell me how to do this?
# 2  
Old 04-11-2008
Assuming your columns are separated by a single space (or fixed width), you can use cut to extract the ones which should be in both files, and compare.

Code:
cut -d ' ' -f1-5 file2 | diff -u file1 - | less

You probably don't want diff -u output as the end result, but that should get you going.
# 3  
Old 04-11-2008
Network Comparing similar columns in two different files

Hi,

I dont understand ur reply. I can probably give you the relevant content of the file.

File1 contains

BOOKS COUNT: 40
BOOKS AUTHOR1 SUM:1018 MAX:47 MIN:1 AVG:25.45
BOOKS AUTHOR3 SUM:181 MAX:48 MIN:3 AVG:18.1

Note:Read it as Table columnname sum(column) max(column) min(column)

File2 contains

BOOKS COUNT: 40
BOOKS AUTHOR1 SUM:1018 MAX:45 MIN:2 AVG:23.55
BOOKS AUTHOR2 SUM:561 MAX:49 MIN:1 AVG:21.5769231
BOOKS AUTHOR3 SUM:181 MAX:48 MIN:3 AVG:18.1


The file 2 contains the additional column named AUTHOR2.

Comparison should be such that the column AUTHOR1 in file 1 is compared with the column AUTHOR1 in file 2.

Here The coumn AUTHOR! in the table BOOKS mismatches with the other file.

I should write a script to compare the columns of the table as said and report the mismatches to a different file and also report the additioanl columns present in the 2nd file.

The output should be in this format.

Column AUTHOR1 in the tabloe BOOKS mismatches.
BOOKS AUTHOR1 SUM:1018 MAX:47 MIN:1 AVG:25.45
BOOKS AUTHOR1 SUM:1018 MAX:45 MIN:2 AVG:23.55

Column AUTHOR2 is additional.


Please help me in writing a script to do this job.

Thanks.Smilie
# 4  
Old 04-11-2008
Data Comparing similar columns in two different files

Hi,

File 1 and 2 contains the statistics for many other tables in the format mentioned before. And the column are separated by a single space.
# 5  
Old 04-11-2008
comm command

comm can be used to handle common lines between two files.

Code:
comm -3 File1 File2

Further help in
Code:
man comm

# 6  
Old 04-11-2008
Code:
awk 'NR == FNR { 
f1[$1,$2] = $0
next
}
($1 SUBSEP $2) in f1 && $0 != f1[$1,$2] {
  printf "Column %s in the table %s mismatches\n%s\n%s\n", 
  $2, $1, f1[$1,$2], $0
}
!(($1 SUBSEP $2) in f1) {
  printf "Columns %s is additional\n", $2 
}' file1 file2

Use nawk or /usr/xpg4/bin/awk on Solaris.
# 7  
Old 04-14-2008
Question Table name to be appended to the output

SmilieHi Radoulov,

Thanks for your help. The output of your code is almost what i expected.

The ouput i get for the additional columns present is in this format

column 'column_name' is additional.

But this doesn't tell from which table.What can be done to make my output look like this.

column 'column_name' from the table 'table_name' is additional.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Comparing two files in UNIX and create a new file similar to equi join

I have 2 files namely branch.txt file & RXD.txt file as below Ex:Branch.txt ========================= B1,Branchname1,city,country B2,Branchname2,city,country B3,Branchname3,city,country B4,Branchname4,city,country B5,Branchname5,city,country RXD file : will... (11 Replies)
Discussion started by: satece
11 Replies

2. Shell Programming and Scripting

Merging two columns from two files with similar names into a loop

I have two files like this: fileA.net A B C fileA.dat 1 2 3 and I want the output output_expected A 1 B 2 C 3 I know that the easier way is to do a paste fileA.net fileA.dat, but the problem is that I have 10,000 couple of files (fileB.net with fileB.dat; fileC.net with... (3 Replies)
Discussion started by: valente
3 Replies

3. Shell Programming and Scripting

Computing the ratio of similar columns in the two files using awk script

Thanks Bartus11 for your help in the following code to compare the two files "t1" and "t2". awk 'NR==FNR{a=1;next}$2 in a{print $2}' t1 t2 First can anyone explain that what is the purpose of assigning a =1? Second, the current script is printing out the matched columns between the... (4 Replies)
Discussion started by: coder83
4 Replies

4. Shell Programming and Scripting

Comparing two columns from two different files

Hi, I have a single-column file1 having records like: 00AB01/11 43TG22/00 78RC09/34 ...... ...... and a second file , file 2 having two columns like 78RC09/34 1 45FD11/11 2 00AB01/11 3 43TG22/00 4 ...... ...... (8 Replies)
Discussion started by: amarn
8 Replies

5. UNIX for Dummies Questions & Answers

Merge two files with two columns being similar

Hi everyone. How can I merge two files, where each file has 2 columns and the first columns in both files are similar? I want all in a file of 4 columns; join command removes the duplicate columns. 1 Dave 2 Mark 3 Paul 1 Apple 2 Orange 3 Grapes to get it like this in the 3rd file:... (9 Replies)
Discussion started by: Atrisa
9 Replies

6. Shell Programming and Scripting

comparing two columns from two different files

Hello, I have two files as 1.txt and 2.txt with number as columns. 1.txt 0 53.7988 1 -30.0859 2 20.1632 3 14.2135 4 14.6366 5 -37.6258 . . . 31608 -8.57333 31609 -2.58554 31610 -24.2857 2.txt (1 Reply)
Discussion started by: AKD
1 Replies

7. Shell Programming and Scripting

comparing 2 columns from 2 files

Hey, I have 2 files that have a name and then a number: File 1: dog 21 dog 24 cat 33 cat 27 dog 76 cat 65 File 2: dog 109 dog 248 cat 323 cat 207 cat 66 (2 Replies)
Discussion started by: dcfargo
2 Replies

8. Shell Programming and Scripting

comparing the columns in two files

I have two files file1 and file 2 both are having multiple coloumns.i want to select only two columns. i used following code to get the desired columns,with ',' as delimiter cut -d ',' -f 1,2 file1 | sort > file1.new cut -d ',' -f 1,2 file2 | sort > file2.new I want to get the coloums... (1 Reply)
Discussion started by: bab123
1 Replies

9. Shell Programming and Scripting

Comparing Columns of two FIles

Dear all, I have two files in UNIX File1 and File2 as below: File1: 1,1234,.,67.897,,0 1,4134,.,87.97,,4 0,1564,.,97.8,,1 File2: 2,8798,.,67.897,,0 2,8879,.,77.97,,4 0,1564,.,97.8,,1 I want to do the following: (1) Make sure that both the files have equal number of columns and if... (4 Replies)
Discussion started by: ggopal
4 Replies

10. UNIX for Advanced & Expert Users

Comparing Columns of two FIles

Dear all, I have two files in UNIX File1 and File2 as below: File1: 1,1234,.,67.897,,0 1,4134,.,87.97,,4 0,1564,.,97.8,,1 File2: 2,8798,.,67.897,,0 2,8879,.,77.97,,4 0,1564,.,97.8,,1 I want to do the following: (1) Make sure that both the files have equal number of columns and if... (1 Reply)
Discussion started by: ggopal
1 Replies
Login or Register to Ask a Question