How to merge two tables based on a matched column?


 
Thread Tools Search this Thread
Top Forums UNIX for Dummies Questions & Answers How to merge two tables based on a matched column?
# 1  
Old 09-17-2014
How to merge two tables based on a matched column?

Hi,

Please excuse me , i have searched unix forum, i am unable to find what i expect ,
my query is , i have 2 files of same structure and having 1 similar field/column , i need to merge 2 tables/files based on the one matched field/column (that is field 1),

file 1:
Code:
TABLE_NAME|PROD_COUNT|UAT_COUNT|DIFFERENCE
IBOXX_INDEX_WEIGHTS|21018|21018|0
EQUITY_INDEX_WEIGHTS|211909|211909|0
IGMR_FEED_NODE_MAP|38873|38873|0
BUS_HIERARCHY|105249|105249|0
SLICE_HIERARCHY|5942|5942|0
EQUITY_INDEX_WEIGHTS_ROLLUP|4246|4844|598
IBOXX_INDEX_WEIGHTS_ROLLUP|943|943|0
MRAD_CACHE|90571|90571|0
SPOT_RATES|218|218|0
POSITION_RATING|0|0|0
CDS_INDEX_JTD_WEIGHTS|0|841320|841320
CURVE_RECOVERY_RATE|1983|1983|0
FS_ENRICH|7634190|7634190|0
IBOXX_ENRICH|47957|47957|0

file 2:
Code:
TABLE_NAME|PROD_COUNT|UAT_COUNT|DIFFERENCE
IBOXX_INDEX_WEIGHTS|21018|21018|0
EQUITY_INDEX_WEIGHTS|213936|213936|0
IGMR_FEED_NODE_MAP|38166|38166|0
BUS_HIERARCHY|105230|105230|0
SLICE_HIERARCHY|5940|5940|0
EQUITY_INDEX_WEIGHTS_ROLLUP|4220|4629|409
IBOXX_INDEX_WEIGHTS_ROLLUP|943|943|0
MRAD_CACHE|0|90569|90569
SPOT_RATES|0|218|218
POSITION_RATING|0|0|0
CDS_INDEX_JTD_WEIGHTS|0|841320|841320
CURVE_RECOVERY_RATE|0|1983|1983
FS_ENRICH|7274101|7274101|0
IBOXX_ENRICH|0|47957|47957

expected output (in file2 , excluding field1, need to merge rest of field with file1):
Code:
TABLE_NAME|PROD_COUNT|UAT_COUNT|DIFFERENCE|PROD_COUNT|UAT_COUNT|DIFFERENCE
IBOXX_INDEX_WEIGHTS|21018|21018|0|21018|21018|0
EQUITY_INDEX_WEIGHTS|211909|211909|0|213936|213936|0
IGMR_FEED_NODE_MAP|38873|38873|0|38166|38166|0
BUS_HIERARCHY|105249|105249|0|105230|105230|0
SLICE_HIERARCHY|5942|5942|0|5940|5940|0
EQUITY_INDEX_WEIGHTS_ROLLUP|4246|4844|598|4220|4629|409
IBOXX_INDEX_WEIGHTS_ROLLUP|943|943|0|943|943|0
MRAD_CACHE|90571|90571|0|0|90569|90569
SPOT_RATES|218|218|0|0|218|218
POSITION_RATING|0|0|0|0|0|0
CDS_INDEX_JTD_WEIGHTS|0|841320|841320|0|841320|841320
CURVE_RECOVERY_RATE|1983|1983|0|0|1983|1983
FS_ENRICH|7634190|7634190|0|7274101|7274101|0
IBOXX_ENRICH|47957|47957|0|0|47957|47957


tried like this
Code:
awk 'FNR==NR{A[NR]=$1;next}{print A[FNR]"$1"$1}' file2 file1

but first column is also appearing.

Please advise and give a gist of awk command.

Thanks,
Regards,
karthikram
# 2  
Old 09-17-2014
Hello,

Kindly use the following code for same.

Code:
awk -F"|" 'NR==FNR{a[$1]=$0;next} ($1 in a){b=$1;$1="";print a[b]  $0}' OFS="|" file1 file2

Output will be as follows.

Code:
TABLE_NAME|PROD_COUNT|UAT_COUNT|DIFFERENCE|PROD_COUNT|UAT_COUNT|DIFFERENCE
IBOXX_INDEX_WEIGHTS|21018|21018|0|21018|21018|0
EQUITY_INDEX_WEIGHTS|211909|211909|0|213936|213936|0
IGMR_FEED_NODE_MAP|38873|38873|0|38166|38166|0
BUS_HIERARCHY|105249|105249|0|105230|105230|0
SLICE_HIERARCHY|5942|5942|0|5940|5940|0
EQUITY_INDEX_WEIGHTS_ROLLUP|4246|4844|598|4220|4629|409
IBOXX_INDEX_WEIGHTS_ROLLUP|943|943|0|943|943|0
MRAD_CACHE|90571|90571|0|0|90569|90569
SPOT_RATES|218|218|0|0|218|218
POSITION_RATING|0|0|0|0|0|0
CDS_INDEX_JTD_WEIGHTS|0|841320|841320|0|841320|841320
CURVE_RECOVERY_RATE|1983|1983|0|0|1983|1983
FS_ENRICH|7634190|7634190|0|7274101|7274101|0
IBOXX_ENRICH|47957|47957|0|0|47957|47957


Thanks,
R. Singh
This User Gave Thanks to RavinderSingh13 For This Post:
# 3  
Old 09-17-2014
You didn't specify the field separator:-
Code:
awk -F\| '
        NR == FNR {
                T = $0
                sub(/[^|]*/, X, T)
                A[$1] = T
                next
        }
        $1 in A {
                print $0 A[$1]
        }
' file2 file1

This User Gave Thanks to Yoda For This Post:
# 4  
Old 09-17-2014
Hi RavinderSingh/Yoda,

Both are working great , it is simple but i am unable to figure out.

@RavinderSingh
Code:
awk -F"|" 'NR==FNR{a[$1]=$0;next} ($1 in a){b=$1;$1="";print a[b]  $0}' OFS="|" file1 file2

could you Please explain about this awk command.

Thanks,
Regards,
karthikram
# 5  
Old 09-17-2014
Hello,

Following may help you.

Code:
awk -F"|" '
FNR==NR        ##### This condition will be true only when file1 will be read #####
{a[$1]=$0;     ##### Here I am making array a whose index is $1 and value is $0(line) ####
next           ##### Stop processing the current input record ####
} 
($1 in a)      ##### Now checking $1's avalibility in array a as we have defined $1 as index of array a ####
{b=$1;         ##### assigning $1's value to variable b ####
$1="";         ##### making $1 to NULL as it is not required as per request ####
print a[b]  $0 ##### printing matching index array's value with file2's current line ####
}' OFS="|" file1 file2

Thanks,
R. Singh

Last edited by RavinderSingh13; 09-17-2014 at 01:14 PM..
This User Gave Thanks to RavinderSingh13 For This Post:
# 6  
Old 09-17-2014
Of course it is possible to use awk-scripts, sed-scripts, perl-scripts and probably another array of programmable text filters for this.

You might be interested to know that Unix offers a genuine tool for exactly your purpose: join. join takes two files as source and generates an output line (which format you can control) for each matching pair of lines.

Here is an example (taken from the join man page of AIX): suppose you have two files, "names" and "phone"

names:

Code:
Adams A.        555-6235
Dickerson B.    555-1842
Erwin G.        555-1234
Jackson J.      555-0256
Lewis B.        555-3237
Norwood M.      555-5341
Smartt D.       555-1540
Wright M.       555-1234
Xandy G.        555-5015

phone:

Code:
Erwin           Dept.  389
Frost           Dept.  217
Nicholson       Dept.  311
Norwood         Dept.  454
Wright          Dept.  520
Xandy           Dept.  999

The command

Code:
join  phone  names

would produce the output:

Code:
# join phone names
Erwin  G.        555-1234        Dept.  389
Norwood  M.      555-5341        Dept.  454
Wright  M.       555-1234        Dept.  520
Xandy  G.        555-5015        Dept.  999

Each line consists of the join field (per default the first field in the line, which is the last name) followed by the rest of the line found in the phone file, then the rest of the line in the names file.

There are options to control the process: display unmatched lines, declare different key values, separators, etc. Notice that join works on sorted files, therefore sorting one or both files before you use them influences output.

I hope this helps.

bakunin
This User Gave Thanks to bakunin For This Post:
 
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Find matched patterns in a column of 2 files with different size and merge them

Hi, i have input files like below:- input1 Name Seq_ID NewID Scores MT1 A0QZX3 1.65 277.4 IVO A0QZX3 1.65 244.5 HPO A0QZX3 1.65 240.5 RgP A0Q3PP 5.32 241.0 GX1 LPSZ3S 96.1 216.9 MEL LPSS3X 4.23 204.1 LDD LPSS3X 4.23 100.2 input2 Fac AddName NewID ... (9 Replies)
Discussion started by: redse171
9 Replies

2. Shell Programming and Scripting

Merge files based on the column value

Hi Friends, I have a file file1.txt 1|ABC|3|jul|dhj 2|NHU|4|kil|eu 3|hjd|34|hfd|43 file2.txt 1||3|KING|dhj 2|NHU||k| 3|hjd|34|hd|43 i want to merge file1.txt file2.txt based on the column null values in file2.txif there are any nulls in column values , (5 Replies)
Discussion started by: i150371485
5 Replies

3. Shell Programming and Scripting

Merge multiple tables into big matrix

Hi all, I have a complex (beyond my biological expertise) problem at hand. I need to merge multiple files into 1 big matrix. Please help me with some code. Inp1 Ang_0 chr1 98 T A Ang_0 chr1 352 G A Ang_0 chr1 425 C T Ang_0 chr2 ... (1 Reply)
Discussion started by: newbie83
1 Replies

4. UNIX for Dummies Questions & Answers

Cut from tables based on column values

Hello, I have a tab-delimited table that may contain 11,12 or 13 columns. Depending on the number of columns, I want to cut and get a sub table as shown below. However, the awk commands in the code seem to be an issue. What should I be doing differently? #cut columns 1-2,4-5,11 when 12 &... (3 Replies)
Discussion started by: Gussifinknottle
3 Replies

5. Shell Programming and Scripting

merge multiple tables with perl

Hi everyone, I once again got stuck with merging tables and was wondering if someone could help me out on that problem. I have a number of tab delimited tables which I need to merge into one big one. All tables have the same header but a different number of rows (this could be changed if... (6 Replies)
Discussion started by: TuAd
6 Replies

6. Shell Programming and Scripting

Help with merge two file based on similar column content

Input file 1: A1BG A1BG A1BG A1CF A1CF BCAS BCAS A2LD1 A2M A2M HAT . . Input file 2: A1BG All A1CF TEMP (5 Replies)
Discussion started by: perl_beginner
5 Replies

7. Shell Programming and Scripting

Merge Two Tables with duplicates in first table

Hi.. File 1: 1 aa rep 1 dd rep 1 kk rep 2 bb sad 2 ss sad 3 ee dam File 2 1 apple fruit 2 mango tree 3 lilly flower output: 1 aaple fruit aa,dd,kk rep (7 Replies)
Discussion started by: empyrean
7 Replies

8. Shell Programming and Scripting

merge two two txt files into one file based on one column

Hi, I have file1.txt and file2.txt and would like to create file3.txt based on one column in UNIX Eg: file1.txt 17328756,0000786623.pdf,0000786623 20115537,0000793892.pdf,0000793892 file2.txt 12521_74_4.zip,0000786623.pdf 12521_15_5.zip,0000793892.pdf Desired Output ... (5 Replies)
Discussion started by: techmoris
5 Replies

9. Shell Programming and Scripting

Merge Two Files based on First column

Hi, I need to join two files based on first column of both files.If first column of first file matches with the first column of second file, then the lines should be merged together and go for next line to check. It is something like: File one: 110001 abc efd 110002 fgh dfg 110003 ... (10 Replies)
Discussion started by: apjneeraj
10 Replies

10. Shell Programming and Scripting

merge rows based on a common column

Hi guys, Please guide me if you have a solution to this problem. I have tried paste -s but it's not giving the desired output. I have a file with the following content- A123 box1 B345 bat2 C431 my_id A123 service C431 box1 A123 my_id I need two different outputs- OUTPUT1 A123... (6 Replies)
Discussion started by: smriti_shridhar
6 Replies
Login or Register to Ask a Question