Comparing two files in UNIX and create a new file similar to equi join


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Comparing two files in UNIX and create a new file similar to equi join
# 1  
Old 11-08-2014
Comparing two files in UNIX and create a new file similar to equi join

I have 2 files namely branch.txt file & RXD.txt file as below

Ex:Branch.txt
=========================
Code:
   B1,Branchname1,city,country
   B2,Branchname2,city,country
   B3,Branchname3,city,country
   B4,Branchname4,city,country
   B5,Branchname5,city,country

RXD file : will contain old branch code, old accountid, new branch code and new Accountid.
========

RXD.txt
===============
Code:
   B1,B1A1,WB1,WB1A1
   B1,B1A2,WB1,WB1A2
   B2,B2A1,WB2,WB2A1
   B2,B2A2,WB2,WB2A2
   B3,B3A1,WB3,WB3A1
   B1,B3A2,WB3,WB3A2

We need to compare old branch codes in the abv 2 files and create a new newBranchfilter.txt with new branch code (Similar to oracle equi join)

newBranchfilter.txt
=================
Code:
   WB1,Branchname1,city,country
   WB2,Branchname2,city,country
   WB3,Branchname3,city,country

Im new to Unix shell scripting..can anyone help me to solve this using unix shell script.
Thank you!


Moderator's Comments:
Mod Comment Please use code tags next time for your code and data. Thanks

Last edited by vbe; 11-08-2014 at 08:53 AM..
# 2  
Old 11-08-2014
You will find many examples in these forums for this kind of problem. In this case, you could try:
Code:
awk -F, 'NR==FNR{A[$1]=$3; next} $1 in A{$1=A[$1]}1' FS=, OFS=, file2 file1

which would produce:

Code:
WB1,Branchname1,city,country
WB2,Branchname2,city,country
WB3,Branchname3,city,country
B4,Branchname4,city,country
B5,Branchname5,city,country

or :
Code:
awk -F, 'NR==FNR{A[$1]=$3; next} $1 in A{$1=A[$1]; print}' FS=, OFS=, file2 file1

which would leave out the non-matched records:
Code:
WB1,Branchname1,city,country
WB2,Branchname2,city,country
WB3,Branchname3,city,country



--- Edit ---
There appears to be an error in the second sample file. I presume it should be:
Code:
B1,B1A1,WB1,WB1A1
B1,B1A2,WB1,WB1A2
B2,B2A1,WB2,WB2A1
B2,B2A2,WB2,WB2A2
B3,B3A1,WB3,WB3A1
B3,B3A2,WB3,WB3A2


Last edited by Scrutinizer; 11-08-2014 at 08:01 PM..
This User Gave Thanks to Scrutinizer For This Post:
# 3  
Old 11-19-2014
Hi Scrutinizer,
Code:
awk -F, 'NR==FNR{A[$1]=$3; next} $1 in A{$1=A[$1]; print}' FS=, OFS=, file2 file1

is working perfectly. Can you please explain the command and its working?

Thanks,
satece

Last edited by Franklin52; 11-19-2014 at 12:16 PM.. Reason: Please use code tags
# 4  
Old 11-19-2014
Please use code tags as required by forum rules!

Code:
awk -F,                         # run awk and set field separator char
'                               # quote the entire script
NR==FNR {A[$1]=$3; next}        # for the first file, i.e. the file's line no. equals the overall line no., keep field 3 in array A indexed by field 1, stop processing this line, jump to next line
 $1 in A {$1=A[$1];             #  if field 1 points to an entry in A, replace it by that A entry's  contents
          print}                # and print this line
'                               # quote the entire script
FS=,                            # set field sep. again (pointless)
OFS=,                           # set output field sep.
file2                           #read and process file 2 first
file1                           # process file2

This User Gave Thanks to RudiC For This Post:
# 5  
Old 02-05-2015
Hi,

Similar to the above post, we are filtering the lines based on the 2 columns.

Input file1

Code:
a1,b1,1
a2,b2,2
a3,b4,3
a4,b3,4

Input file2
Code:
5,a1,b1
6,a2,b2
7,a3,b3
8,a4,b4

The requirement is to filter the records from the file1 when col1 and col2 of file1 matches with col2 and col3 of file 2 respectively.

when tried with the following awk command:

Code:
awk -F',' -v OFS=',' 'NR==FNR{a[$2FS$3]=$1;next} $1FS$2 in a{print $0} file2 file1

we got the output:
Code:
a1,b1,1
a2,b2,2
a3,b4,3

But the expected output is:
Code:
a1,b1,1
a2,b2,2

Please guide.

Last edited by Scrutinizer; 02-05-2015 at 07:26 AM.. Reason: code tags
# 6  
Old 02-05-2015
Code:
[akshay@localhost tmp]$ cat f1
a1,b1,1
a2,b2,2
a3,b4,3
a4,b3,4

Code:
[akshay@localhost tmp]$ cat f2
5,a1,b1
6,a2,b2
7,a3,b3
8,a4,b4

Code:
$ awk -F, 'FNR==NR{a[$2,$3];next}($1,$2) in a'  f2 f1
a1,b1,1
a2,b2,2


OR


Code:
[akshay@localhost tmp]$ awk -F, 'FNR==NR{a[$1,$2]=$0;next}($2,$3) in a{print a[$2,$3]}'   f1 f2
a1,b1,1
a2,b2,2

--edit--

Code:
awk -F, '
          # Here we read file1 (f1)
	  # FNR == NR only while reading first file

	  FNR==NR{
		   # We are interested in comparing
		   # Column1 and Column2 of file1 with
		   # Column2 and Column3 of file2, So
		   # here we create index Column1,Column2 ($1,$2)
                   # and its saved in array "a" where array value
		   # will be record of file1

		   a[$1,$2]=$0

		   # The next statement forces awk to 
		   # immediately stop processing the current record and go on to the next record

		   next
		 }

         # Here we read file2 (f2)
	 # again here we create index using column2 and column3 ($2,$3) of file2
	 # if index exists in array "a" we created above
	 # then print array value

     ($2,$3) in a{
		   print a[$2,$3]
		 }
       '   f1 f2

-F, -> Field separator is set to comma

Last edited by Akshay Hegde; 02-05-2015 at 06:01 AM..
# 7  
Old 02-05-2015
Hi Akshay,

Thanks for the quick help. But facing the same problem.
Getting the output as:
Code:
a1,b1,1
a2,b2,2
a3,b4,3

Please suggest.

Last edited by Scrutinizer; 02-05-2015 at 07:27 AM..
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

How to compare two files in UNIX using similar to vlookup?

Hi, I want to compare same column in two files, if values match then display the column or display "NA". Ex : File 1 : 123 abc xyz pqr File 2: 122 aab fdf pqr fff qqq rrr (1 Reply)
Discussion started by: hkoshekay
1 Replies

2. Shell Programming and Scripting

Alignment tool to join text files in 2 directories to create a parallel corpus

I have two directories called English and Hindi. Each directory contains the same number of files with the only difference being that in the case of the English Directory the tag is .english and in the Hindi one the tag is .Hindi The file may contain either a single text or more than one text... (7 Replies)
Discussion started by: gimley
7 Replies

3. Shell Programming and Scripting

How to join one file with multiple files in a directory in UNIX?

Dear folks Hello I have a one file called (file1) which the structure looks like this 1 gi|358484521|ref|NW_003764373.1| 1 gi|358484520|ref|NW_003764374.1| 1 gi|358484519|ref|NW_003764375.1| . . . 30 gi|368484519|ref|NW_00449375.1| In addition, I have around 300... (19 Replies)
Discussion started by: sajmar
19 Replies

4. Shell Programming and Scripting

Comparing Select Columns from two CSV files in UNIX and create a third file based on comparision

Hi , I want to compare first 3 columns of File A and File B and create a new file File C which will have all rows from File B and will include rows that are present in File A and not in File B based on First 3 column comparison. Thanks in advance for your help. File A A,B,C,45,46... (2 Replies)
Discussion started by: ady_koolz
2 Replies

5. Shell Programming and Scripting

Join all the lines matching similar pattern

I am trying to Join all the lines matching similar pattern. Example ; I wanted to join all the lines which has sam to a single line. In next line, i wanted to have all the lines with jones to a single line....etc > cat sample.txt sam 2012/11/23 sam 2012/12/5 sam 2012/12/5 jones... (2 Replies)
Discussion started by: evrurs
2 Replies

6. Shell Programming and Scripting

Create SQL DML insert statements from file using AWK or similar

Hi all. This is my first post on this forum. I've previously found great help in the huge knowledgebase that is here, but this time I have not been able to find a solution to my problem. I have a large text file that looks like this: typedef struct ABC_struct_nbr1_ { char attr1; /*... (0 Replies)
Discussion started by: Yagi Uda
0 Replies

7. Shell Programming and Scripting

Help in unix script to join similar lines of input

Hi, I have been thinking of how to script this but i have no clue at all.. Could someone please help me out or give me some idea on this? I would like to group those lines with the same first variable in each line, joining the 2nd variables with commas. Let's say i have the following input. ... (3 Replies)
Discussion started by: rei125
3 Replies

8. Shell Programming and Scripting

Comparing similar columns in two different files

Hi, I have two text files.The first and the 2nd file have data in the same format For e.g. The first file has table_name1 column1 sum(column1) max(column1) min(column1) table_name1 column2 sum(column2) max(column2) min(column2) table_name1 coulmn3 sum(column3) max(column3) min(column3) ... (13 Replies)
Discussion started by: ragavhere
13 Replies

9. Shell Programming and Scripting

Find duplicate value comparing 2 files and create an output

I need a perl script which will create an output file after comparing two diff file in a directory path: /export/home/abc/file1 /export/home/abc/file2 File Format: <IP>TAB<DeviceName><TAB>DESCRIPTIONS file1: 10.1.2.1.3<tab>abc123def<tab>xyz.mm1.ppp.... (2 Replies)
Discussion started by: ricky007
2 Replies

10. UNIX for Dummies Questions & Answers

How to join flat files together under unix

Hi, I have 7 big flat files, each contains 24 million records which have been sorted by the first field delimitered by Ctrl B (002). I want to join them together side by side, eg. File A: 1^Ba^Bb 2^Bx^By .... File B: 1^Bc^Bd 2^Bm^Bn .... After merged, it should look like :... (7 Replies)
Discussion started by: xli
7 Replies
Login or Register to Ask a Question