Merge input from two files into one based on conditions


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Merge input from two files into one based on conditions
# 1  
Old 05-20-2015
Merge input from two files into one based on conditions

Using Linux (bash), I have two files which contain information about berries. Example:

file1.txt:
Code:
Blueberry blue 14
Raspberry red 12
Blackberry dark 4

file2.txt
Code:
Blackberry sour 4 3
Blueberry tasty 12 78
Strawberry yummy 33 88

I want to merge these two files into one. The desired result from the above would be:
Code:
Blueberry blue 14 tasty 12 78
Raspberry red 12 - - -
Blackberry dark 4 sour 4 3
Strawberry - - yummy 33 88

So the three columns of file1.txt and the four columns of file2.txt should result in a new file with six columns where all berries present in these two files only are represented with one row each. If a berry is not present in one of the two files, the corresponding entries from that file are replaced with a dash (like for Strawberry and Raspberry above).

This command is getting me somewhere close:
Code:
awk 'FNR==NR{a[$1]=$2 FS $3 FS $4;next}{ print $0, a[$1]}' file2.txt file1.txt

Result:
Code:
Blueberry blue 14 tasty 12 78
Raspberry red 12
Blackberry dark 4 sour 4 3

Only prints the berries that are present in file1.txt though and it doesn't add the dashes. Can someone help?

Thanks!
/Z
# 2  
Old 05-20-2015
You're not too far off:
Code:
awk 'FNR==NR{a[$1]=$2 FS $3 FS $4;next}{ print $0, a[$1]?a[$1]:"- -"; delete a[$1]} END {for (i in a) print i, "- -", a[i]}' file2 file1
Blueberry blue 14 tasty 12 78
Raspberry red 12 - -
Blackberry dark 4 sour 4 3
Strawberry - - yummy 33 88

This User Gave Thanks to RudiC For This Post:
# 3  
Old 05-20-2015
Thanks RudiC! Just get a minor error (also added one missing dash in your command):
Code:
$ awk 'FNR==NR{a[$1]=$2 FS $3 FS $4;next}{ print $0, a[$1]?a[$1]:"- - -"; delete a[$1]} END {for (i in a) print i, "- -", a[i]}' file2.txt file1.txt
Blueberry blue 14 tasty 12 78
Raspberry red 12 - - -
Blackberry dark 4 sour 4 3
 - - -
Strawberry - - yummy 33 88

So there's an extra "- - -" there that I can't explain. Bonus question: if I don't know how many columns file1 and file2 have, is there some way of making this command more dynamic?

Thanks!
/Z
# 4  
Old 05-20-2015
If you sort your files you can then use join:

Code:
$ join -j 1 -e "-" -a 1 -a 2 -o 0,1.2,1.3,2.2,2.3,2.4 file1.sor file2.sor
Blackberry dark 4 sour 4 3
Blueberry blue 14 tasty 12 78
Raspberry red 12 - - -
Strawberry - - yummy 33 88

This User Gave Thanks to Chubler_XL For This Post:
# 5  
Old 05-21-2015
Dynamic case:
Code:
awk '
FNR==NR         {a[$1]=$2 
                 d="-"
                 for (i=3; i<=NF; i++) {a[$1]=a[$1] FS $i; d=d FS "-"}
                 next
                }
                {print $0, a[$1]?a[$1]:d
                 D="-"
                 for (i=3; i<=NF; i++) D=D FS "-"
                 delete a[$1]
                }
END             {for (i in a) print i, D, a[i]
                }
        ' file2 file1

That extra line doesn't appear when I try it...
This User Gave Thanks to RudiC For This Post:
# 6  
Old 05-22-2015
Thanks a lot RudiC. #5 was just what I was looking for. :-)
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Linux

Merge two files based on matching criteria

Hi, I am trying to merge two csv files based on matching criteria: File description is as below : Key_File : 000|ÇÞ|Key_HF|ÇÞ|Key_FName 001|ÇÞ|Key_11|ÇÞ|Sort_Key22|ÇÞ|Key_31 002|ÇÞ|Key_12|ÇÞ|Sort_Key23|ÇÞ|Key_32 003|ÇÞ|Key_13|ÇÞ|Sort_Key24|ÇÞ|Key_33 050|ÇÞ|Key_15|ÇÞ|Sort_Key25|ÇÞ|Key_34... (3 Replies)
Discussion started by: PK29
3 Replies

2. Shell Programming and Scripting

Need script for making files based on some conditions.

Hi All, I have a text file (code_data.txt) with the followig data. AMAR AB123456 XYZ KIRAN CB789 ABC RAJ CS78890 XYZ KAMESH A33535335 ABC KUMAR MD678894 MAT RITESH SR3535355... (26 Replies)
Discussion started by: ROCK_PLSQL
26 Replies

3. Shell Programming and Scripting

Merge two input files and concatenate the rest

Hi Guys, I need a help to creating a bash script to merging two files that containing not in order pattern into single file also within that single file I need to concatenate and manipulate the string from two files. Appreciate for any kind help given. Thanks. Here are the input files:... (3 Replies)
Discussion started by: jabriel
3 Replies

4. Shell Programming and Scripting

Merge files based on columns

011111123444 1234 1 20000 011111123444 1235 1 30000 011111123446 1234 3 40000 011111123447 1234 4 50000 011111123448 1234 3 50000 File2: 011111123444,Rsttponrfgtrgtrkrfrgtrgrer 011111123446,Rsttponrfgtrgtr 011111123447,Rsttponrfgtrguii 011111123448,Rsttponrfgtrgtjiiu I have 2 files... (4 Replies)
Discussion started by: vinus
4 Replies

5. Shell Programming and Scripting

Read input files and merge them in given order and write them to input one param or one file

Dear Friends, I am looking for a shell script to merge input files into one file .. here is my idea: 1st paramter would be outfile file (all input files content) read all input files and merge them to input param 1 ex: if I pass 6 file names to the script then 1st file name as output file... (4 Replies)
Discussion started by: hyd1234
4 Replies

6. Shell Programming and Scripting

Merge files based on the column value

Hi Friends, I have a file file1.txt 1|ABC|3|jul|dhj 2|NHU|4|kil|eu 3|hjd|34|hfd|43 file2.txt 1||3|KING|dhj 2|NHU||k| 3|hjd|34|hd|43 i want to merge file1.txt file2.txt based on the column null values in file2.txif there are any nulls in column values , (5 Replies)
Discussion started by: i150371485
5 Replies

7. Shell Programming and Scripting

awk merging files based on 2 complex conditions

1. if the 1st row IDs of input1 (ID1/ID2.....) is equal to any IDNames of input2 print all relevant values together as defined in the output. 2. A bit tricky part is IDno in the output. All we need to do is numbering same kind of letters as 1 (aa of ID1) and different letters as 2 (ab... (4 Replies)
Discussion started by: ruby_sgp
4 Replies

8. UNIX for Dummies Questions & Answers

any script for joining files based on simple conditions

Condition1; If NPID and IndID of both input1 and input2 are same take all the vaues relevant to them and print together as output Condition2; IDNo in output: Take the highly repeated same letter of similar NPID-IndID as *1* Second highly repeated same letter... (0 Replies)
Discussion started by: stateperl
0 Replies

9. Shell Programming and Scripting

Merge Two Files based on First column

Hi, I need to join two files based on first column of both files.If first column of first file matches with the first column of second file, then the lines should be merged together and go for next line to check. It is something like: File one: 110001 abc efd 110002 fgh dfg 110003 ... (10 Replies)
Discussion started by: apjneeraj
10 Replies

10. Shell Programming and Scripting

Merge files based on key

Hi Friends, Can any one help me with merging these file based on two columns : File1: A|123|99|SAMS B|456|95|GEORGE D|789|85|HOVARD File2: S|123|99|NANcY|6357 S|123|99|GREGRO|83748 A|456|95|HARRY|827|somers S|456|95|ANTONY|546841|RUDOLPH|7263 B|456|95|SMITH|827|BOISE STATE|834... (3 Replies)
Discussion started by: sbasetty
3 Replies
Login or Register to Ask a Question