Join and merge multiple files with duplicate key and fill void columns


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Join and merge multiple files with duplicate key and fill void columns
# 1  
Old 11-05-2019
Join and merge multiple files with duplicate key and fill void columns by using AWK

Join and merge multiple files with duplicate key and fill void columns

Hi guys,

I have many files that I want to merge:

file1.csv:

Code:
1|abc
1|def
2|ghi
2|jkl
3|mno
3|pqr


file2.csv:

Code:
1|123|jojo
1|NULL|bibi
3|789|zaza
3|012|NULL


file3.csv:

Code:
2|1a2b|1994|US
3|3c4e|1995|MG
4|5f6g|1996|FR


My desired output is like this, conserve the duplicate key and fill it with the data of anothers files, and the void columns is filling by NULL:

Code:
1|abc|123|jojo|NULL|NULL|NULL
1|def|123|jojo|NULL|NULL|NULL
1|abc|NULL|bibi|NULL|NULL|NULL
1|def|NULL|bibi|NULL|NULL|NULL
2|ghi|NULL|NULL|1a2b|1994|US
2|jkl|NULL|NULL|1a2b|1994|US
3|mno|789|zaza|3c4e|1995|MG
3|pqr|789|zaza|3c4e|1995|MG
3|mno|012|NULL|3c4e|1995|MG
3|pqr|012|NULL|3c4e|1995|MG
4|NULL|NULL|NULL|5f6g|1996|FR

How to do that with awk, join or shell scripts?

Last edited by yjacknewton; 11-06-2019 at 05:03 AM..
# 2  
Old 11-05-2019
Any attempts / ideas / thoughts from your side?


The fourth line of your desired output doesn't seem to be consistent.
# 3  
Old 11-05-2019
Quote:
Originally Posted by RudiC
Any attempts / ideas / thoughts from your side?

The fourth line of your desired output doesn't seem to be consistent.
Yes, I forgot to write "NULL"; from this column,
I have already seen many post in forum but nothing match with my desire

Last edited by yjacknewton; 11-05-2019 at 04:53 AM..
# 4  
Old 11-05-2019
The only algo to preserve duplicate data that I found is this:
awk 'BEGIN{FS=OFS="|"}NR==FNR{a[$1]=$0;next;}{print $0,a[$1]}' test*;

Does someone have some idea?
But result is not appropiate...
Code:
1|123|jojo|1|def
1|NULL|bibi|1|def
3|789|zaza|3|pqr
3|012|NULL|3|pqr
2|1a2b|1994|US|2|jkl
3|3c4e|1995|MG|3|pqr
4|5f6g|1996|FR|


Last edited by Scrutinizer; 11-05-2019 at 12:35 PM.. Reason: Replaced " with "
# 5  
Old 11-05-2019
Hi, try:
Code:
join -t \| -a 1 -a 2 -e 'NULL' -o 0 1.2 1.3 1.4 2.2 2.3 2.4 \
<(join -t \| -a 1 -a 2 -e 'NULL' -o 0 1.2 2.2 2.3 file1 file2) file3


Last edited by Scrutinizer; 11-05-2019 at 02:18 PM..
This User Gave Thanks to Scrutinizer For This Post:
# 6  
Old 11-06-2019
Quote:
Originally Posted by Scrutinizer
Hi, try:
Code:
join -t \| -a 1 -a 2 -e 'NULL' -o 0 1.2 1.3 1.4 2.2 2.3 2.4 \ 
<(join -t \| -a 1 -a 2 -e 'NULL' -o 0 1.2 2.2 2.3 file1 file2) file3

I tried this code and here is the result:
Code:
1|abc|123|jojo|NULL|NULL|NULL
1|abc|NULL|bibi|NULL|NULL|NULL
1|def|123|jojo|NULL|NULL|NULL
1|def|NULL|bibi|NULL|NULL|NULL
2|ghi|NULL|NULL|1a2b|1994|US
2|jkl|NULL|NULL|1a2b|1994|US
2|sds|NULL|NULL|1a2b|1994|US
3|mno|789|zaza|3c4e|1995|MG
3|mno|012|NULL|3c4e|1995|MG
3|pqr|789|zaza|3c4e|1995|MG
3|pqr|012|NULL|3c4e|1995|MG
4|NULL|NULL|NULL|5f6g|1996|FR

Can someone give a solution by using awk?

Last edited by Scrutinizer; 06-06-2020 at 12:14 PM..
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Join, merge, fill NULL the void columns of multiples files like sql "LEFT JOIN" by using awk

Hello, This post is already here but want to do this with another way Merge multiples files with multiples duplicates keys by filling "NULL" the void columns for anothers joinning files file1.csv: 1|abc 1|def 2|ghi 2|jkl 3|mno 3|pqr file2.csv: 1|123|jojo 1|NULL|bibi... (2 Replies)
Discussion started by: yjacknewton
2 Replies

2. UNIX for Dummies Questions & Answers

Merge selective columns from files based on common key

Hi, I am trying to selectively merge two files based on keys reported in the 1st column. File1: #file1-header1 file1-header2 111 qwe rtz uio 198 asd fgh jkl 165 yxc 789 poi uzt rew 89 lkj File2: #file2-header2 file2-header2 165 ghz nko2 ... (2 Replies)
Discussion started by: dovah
2 Replies

3. Shell Programming and Scripting

Join two files combining multiple columns and produce mix and match output

I would like to join two files when two columns in each file matches with each other and then produce an output when taking multiple columns. Like I have file A 1234,ABCD,23,JOHN,NJ,USA 2345,ABCD,24,SAM,NY,USA 5678,GHIJ,24,TOM,NY,USA 5678,WXYZ,27,MAT,NJ,USA and file B ... (2 Replies)
Discussion started by: mady135
2 Replies

4. Shell Programming and Scripting

Merge columns from multiple files

Hello and Good day I have a lot of files with same number of rows and columns.$2 and $3 are the same in all files . I need to merge $2,$3,$6 from first file and $6 from another files. File1: $1 $2 $3 $4 $5 $6... (8 Replies)
Discussion started by: ali.seifaddini
8 Replies

5. UNIX for Dummies Questions & Answers

Merge columns from multiple files

Hi all, I've searched the web for a long time trying to figure out how to merge columns from multiple files. I know paste will append columns like so: paste file1 file2 file3 file4 file5 ... But this becomes inconvenient when you want to append a large number of files into a single file. ... (2 Replies)
Discussion started by: torchij
2 Replies

6. Shell Programming and Scripting

Sort and join multiple columns using awk

Is it possible to join all the values after sorting them based on 1st column key and replace empty rows with 0 like below ? input a1 0 a1 1 a1 1 a3 1 b2 1 a2 1 a4 1 a2 1 a4 1 c4 1 a3 1 d1 1 a3 1 b1 1 d1 1 a4 1 c4 1 b2 1 b1 1 b2 1 c4 1 d1 1 output... (8 Replies)
Discussion started by: quincyjones
8 Replies

7. Shell Programming and Scripting

Merge multiple lines in same file with common key using awk

I've been a Unix admin for nearly 30 years and never learned AWK. I've seen several similar posts here, but haven't been able to adapt the answers to my situation. AWK is so damn cryptic! ;) I have a single file with ~900 lines (CSV list). Each line starts with an ID, but with different stuff... (6 Replies)
Discussion started by: protosd
6 Replies

8. UNIX for Dummies Questions & Answers

How to use the the join command to join multiple files by a common column

Hi, I have 20 tab delimited text files that have a common column (column 1). The files are named GSM1.txt through GSM20.txt. Each file has 3 columns (2 other columns in addition to the first common column). I want to write a script to join the files by the first common column so that in the... (5 Replies)
Discussion started by: evelibertine
5 Replies

9. Shell Programming and Scripting

"Join" or "Merge" more than 2 files into single output based on common key (column)

Hi All, I have working (Perl) code to combine 2 input files into a single output file using the join function that works to a point, but has the following limitations: 1. I am restrained to 2 input files only. 2. Only the "matched" fields are written out to the "matched" output file and... (1 Reply)
Discussion started by: Katabatic
1 Replies

10. UNIX for Dummies Questions & Answers

Join 2 files with multiple columns: awk/grep/join?

Hello, My apologies if this has been posted elsewhere, I have had a look at several threads but I am still confused how to use these functions. I have two files, each with 5 columns: File A: (tab-delimited) PDB CHAIN Start End Fragment 1avq A 171 176 awyfan 1avq A 172 177 wyfany 1c7k A 2 7... (3 Replies)
Discussion started by: InfoSeeker
3 Replies
Login or Register to Ask a Question