Merge two columns from two files into one if another column matches


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Merge two columns from two files into one if another column matches
# 1  
Old 10-29-2012
Merge two columns from two files into one if another column matches

I have two text files that look something like this:

Code:
A:B:C 123
D:E:F 234
G:H:I 345
J:K:L 123
M:N:O 456
P:Q:R 567

Code:
A:B:C 456
D:E:F 567
G:H:I 678
J:K:L 456
M:N:O 789
P:Q:R 890

I want to find the line where the first column matches and then combine the second columns into a single column so the output looks like this:

Code:
A:B:C 123456
D:E:F 234567
G:H:I 345678
J:K:L 123456
M:N:O 456789
P:Q:R 567890

After that, I want to search through the second columns and remove duplicate ones. In this case, that would mean the A:B:C line and the J:K:L line have matching second columns, so I would remove the J:K:L line.

Code:
A:B:C 123456
D:E:F 234567
G:H:I 345678
M:N:O 456789
P:Q:R 567890

I've tried some awk, but I am not coming anywhere close with this one. Smilie The files can also get pretty big with hundreds of millions of lines, so a speedier option would be best. Thanks!
# 2  
Old 10-29-2012
Code:
join -1 1 -2 1 join1.txt join2.txt | sed 's/\([0-9]\) \([0-9]\)/\1\2/g'

# 3  
Old 10-29-2012
Try:
Code:
paste file1 file2 | awk '!A[$2$4]++{print $1,$2 $4}'

This User Gave Thanks to Scrutinizer For This Post:
# 4  
Old 10-29-2012
Hi

Code:
$ join -1 1 -2 1  f1  f2 | awk '!a[$2$3]++{print $1,$2$3}'
A:B:C 123456
D:E:F 234567
G:H:I 345678
M:N:O 456789
P:Q:R 567890

Guru.
This User Gave Thanks to guruprasadpr For This Post:
# 5  
Old 10-29-2012
Code:
$ awk 'NR==FNR{a[$1]=$2;next}$1 in a{if(!b[a[$1]$2]++)printf("%s %s%s\n",$1,a[$1],$2);}' file1 file2
A:B:C 123456
D:E:F 234567
G:H:I 345678
M:N:O 456789
P:Q:R 567890

This User Gave Thanks to itkamaraj For This Post:
# 6  
Old 10-29-2012
Quote:
Originally Posted by pbluescript
... The files can also get pretty big with hundreds of millions of lines, so a speedier option would be best. Thanks!
I'd really appreciate to know how far this solution
Quote:
Originally Posted by itkamaraj
Code:
$ awk 'NR==FNR{a[$1]=$2;next}...

can be driven, i.e. how many lines will make the awk arrays explode...
# 7  
Old 10-29-2012
Awk only:
Code:
awk '{p=$2; getline<f} !A[$2=p $2]++' f=file2 file1

Code:
awk '{getline $1<f; $0=$0} !A[$2]++' OFS= f=file1 file2


Last edited by Scrutinizer; 10-29-2012 at 02:26 PM..
These 2 Users Gave Thanks to Scrutinizer For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk add all columns if column 1 name matches

Hi - I want to add all columns if column1 name matches. TOPIC1 5 1 4 TOPIC2 3 2 1 TOPIC3 7 2 5 TOPIC1 6 3 3 TOPIC2 4 1 3 TOPIC3 9 5 4 . . . . . . . . . . . . Result should look like TOPIC1 11 4 7 TOPIC2 7 3 4 (1 Reply)
Discussion started by: oraclermanpt
1 Replies

2. UNIX for Beginners Questions & Answers

Matches columns from two different files in shell script

Hi friends, i want to compare first columns from two different files ,if equal print the file2's second column else print the zero.Please help me... file1: a b c d efile2: a 1 c 20 e 30 desired output: 1 0 20 0 30 Please use CODE tags as required by forum rules! Please post in... (1 Reply)
Discussion started by: bhaskar illa
1 Replies

3. Shell Programming and Scripting

Seperated by columns, merge in a file, sort them on common column

Hi All, I have 4 files in below format. I took them as an example. File 1: Cut from position 1-4 then 6-7 then 8-14 then rest left and make them as columns in one new file. Inserting character H to the initial of all line like HCTOT. CTOT 456787897 Low fever CTOR 556712345 High fever... (2 Replies)
Discussion started by: Mannu2525
2 Replies

4. Shell Programming and Scripting

Merge columns on different files

Hello, I have two files that have this format: file 1 86.82 0.00 86.82 43.61 86.84 0.00 86.84 43.61 86.86 0.00 86.86 43.61 86.88 0.00 86.88 43.61 file 2 86.82 0.22 86.84 0.22 86.86 0.22 86.88 0.22 I would like to merge these two files such that the final file looks like... (5 Replies)
Discussion started by: kayak
5 Replies

5. Shell Programming and Scripting

How to get difference of the same column between two files when other column matches?

File 1: 20130416,235800,10.78.25.104,BR2-loc,60.0,1624,50.0,0,50.0,0 20130416,235800,10.78.25.104,BR1-LOC,70.0,10,50.0,0,70.0,0 20130416,235800,10.78.25.104,Hub_None,60.0,15,60.0,0,50.0,0 File 2: 20130417,000200,10.78.25.104,BR2-loc,60.0,1626,50.0,0,50.0,0... (3 Replies)
Discussion started by: Lakshmikumari
3 Replies

6. UNIX for Dummies Questions & Answers

How do I merge multiple columns into one column?

Hi all, I'm looking for a way to merge multiple columns (from one file) into a single column in an output file. The file I have looks somewhat like this: @HWI-ST212 1:N:0 AGTCCTACCGGGAGT + @@@DDDDDHHHHHII @HWI-ST212 1:N:0 CGTTTAAAAATTTCT + @;@B;DDDDH?:F;F... (4 Replies)
Discussion started by: Vnguyen
4 Replies

7. Shell Programming and Scripting

How to merge multiple rows into single row if first column matches ?

Hi, Can anyone suggest quick way to get desired output? Sample input file content: A 12 9 A -0.3 2.3 B 1.0 -4 C 34 1000 C -111 900 C 99 0.09 Output required: A 12 9 -0.3 2.3 B 1.0 -4 C 34 1000 -111 900 99 0.09 Thanks (3 Replies)
Discussion started by: cbm_000
3 Replies

8. Shell Programming and Scripting

Merge columns of different files

Hi, I have tab limited file 1 and tab limited file 2 The output should contain common first column vales and corresponding 2nd column values; AND also unique first column value with corresponding 2nd column value of the file that contains it and 0 for the second file. the output should... (10 Replies)
Discussion started by: polsum
10 Replies

9. Shell Programming and Scripting

merge the two files which has contain columns

Hi may i ask how to accomplish this task: I have 2 files which has multiple columns first file 1 a 2 b 3 c 4 d second file 14 a 9 .... 13 b 10.... 12 c 11... 11 d 12... I want to merge the second file to first file that will looks like this ... (2 Replies)
Discussion started by: jao_madn
2 Replies

10. Shell Programming and Scripting

Joining columns from two files, if the key matches

I am trying to join/paste columns from two files for the rows with matching first field. Any help will be appreciated. Files can not be sorted and may not have all rows in both files. Thanks. File1 aaa 111 bbb 222 ccc 333 File2 aaa sss mmmm ccc kkkk llll ddd xxx yyy Want to... (1 Reply)
Discussion started by: sk_sd
1 Replies
Login or Register to Ask a Question