Join 2 CSVs based on 1 key


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Join 2 CSVs based on 1 key
# 1  
Old 09-15-2016
Join 2 CSVs based on 1 key

Hello,
I have 2 csv as follows:
Code:
a.csv:
name,phone,adress,car[drive]
xy,1234,asbd
yz,2134,asbdf   
tc,6789,salkdur

Code:
b.csv:
telphone,vehicle
2134,toyota
6789,bmw
1234,honda

What is need is this:
Code:
output.csv:
name,phone,adres,car[drive]
xy,1234,asbd,honda
yz,2134,asbdf,toyota
tc,6789,salkdur,bmw

I am able to join (--right) the 2 files based on the key "phone" using "csvjoin", and then copy/pasting appropriate column and deleting the unnecessary one.

Here is what I used:
Code:
csvjoin -c 2, 1 a.csv b.csv >out.csv

But I need some code to do these in 1 shot.

Thanks in advance.

Last edited by Zam_1234; 09-15-2016 at 07:16 PM..
# 2  
Old 09-15-2016
Hi,

Please try this:
Code:
awk 'NR==FNR{a[$2]=$0;next} { for(i in a) {  if ( i==$1 ) { print a[i] " ", $2 } }}'  a.csv b.csv | sort -k2

( or )
Code:
awk 'NR==1 {print $0} NR==FNR{a[$2]=$0;next} { for(i in a) {  if ( i==$1 ) { print a[i] " ", $2 | "sort -k2" } }}'  a.csv b.csv


Last edited by greet_sed; 09-15-2016 at 06:23 PM.. Reason: Updated sorted solution
# 3  
Old 09-15-2016
Try
Code:
join -1 2 -2 1 --header --nocheck file[12]
phone name adress car vehicle
1234 xy asbd honda
2134 yz asbdf  toyota
6789 tc salkdur bmw

Make sure the files are sorted.
This User Gave Thanks to RudiC For This Post:
# 4  
Old 09-15-2016
You say that the input files are CSVs, but that usually means that a single character is used to separate fields in those files. Instead of that, the files you have shown us have varying numbers of spaces between fields (and in the case of one line in a.csv), some extraneous spaces at the end of the line).

I will make the wild guess that your real input files are intended to be <tab> separated instead of separated by sequences of one or more <space>s. I further assume that you want the output to be in the same order as the sequence of lines that appear in the a.csv input file. If all of these assumptions are correct, the following seems to produce the output you want (except for the line with the trailing <tab> or <space>s in a.csv:
Code:
awk '
FNR == NR {
	car[$1] = $2
	next
}
$2 in car {
	$0 = $0 "\t" car[$2]
}
1' b.csv a.csv

which produces the output:
Code:
name  phone    adress    car
xy    1234    asbd	honda
yz    2134    asbdf   	toyota
tc    6789    salkdur	bmw

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
# 5  
Old 09-15-2016
Don,
The files are CSVs. I manually typed them in here, adn this is not the actual file or data, But just the structure of the file.

You are correct, I do want the preserve the order "a.csv" in "out.csv"

:-)
# 6  
Old 09-15-2016
If you choose not to show us sample data that reflects the contents of your actual files, the likelihood of getting a suggestion that will work for you diminishes greatly. If what you are given as suggestions don't work for you, we will have to assume that you will be able to modify our suggestions to work with your real data.

We certainly know that you won't complain that our suggestions don't work with your real data if you are unwilling to tell us what your real data looks like.
# 7  
Old 09-15-2016
Quote:
Originally Posted by Don Cragun
If you choose not to show us sample data that reflects the contents of your actual files, the likelihood of getting a suggestion that will work for you diminishes greatly. If what you are given as suggestions don't work for you, we will have to assume that you will be able to modify our suggestions to work with your real data.

We certainly know that you won't complain that our suggestions don't work with your real data if you are unwilling to tell us what your real data looks like.
Make sense. I am editing the original post with Comma Separated Values.

Thanks for your help. :-)
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Join and merge multiple files with duplicate key and fill void columns

Join and merge multiple files with duplicate key and fill void columns Hi guys, I have many files that I want to merge: file1.csv: 1|abc 1|def 2|ghi 2|jkl 3|mno 3|pqr file2.csv: (5 Replies)
Discussion started by: yjacknewton
5 Replies

2. Shell Programming and Scripting

Linux - Join 2 csv files with common key

Hi, I am trying to join 2 csv files, to create a 3rd output file with the joined data. Below is an example of my Input Data: Input File 1 NAME, FAV_FOOD, FAV_DRINK, ID, GENDER Bob, Fish, Coke, 1, M Lisa, Rice, Water, 2, F Jenny, Noodle, Tea, 3, F Ken, Pizza, Coffee, 4, M Lisa,... (7 Replies)
Discussion started by: RichZR
7 Replies

3. Shell Programming and Scripting

Generate Join clause based on key data

Hi, I have a file pk.txt which has pk data in following format TableName | PK Employee | id Contact|name,country My Output should be Employee | t1.id=s.id Contact| t1.name=s.name AND t1.country=s.country I started of like this: for LIST in `cat pk.txt` do... (5 Replies)
Discussion started by: wahi80
5 Replies

4. UNIX for Dummies Questions & Answers

Join 2 files based on certain column

I have file input1.txt 11103|11|OTTAWA|City|AA|CAR|0|0|1|-1|0|8526|2014-09-07 23:00:14 11103|11|OTTAWA|City|BB|TRAIN|0|0|2|-2|6|6359|2014-09-07 23:00:14 11104|11|CANADA|City|CC|CAR|0|0|2|-2|0|5947|2014-09-07 23:00:14 11104|11|CANADA|City|DD|TRAIN|0|0|2|-2|1|4523|2014-09-07 23:00:14... (5 Replies)
Discussion started by: radius
5 Replies

5. Shell Programming and Scripting

Join based on positions

I have two text files as shown below cat file1.txt Id leng sal mon 25671 34343 56565 5565 44888 56565 45554 6868 23343 23423 26226 6224 77765 88688 87464 6848 66776 23343 63463 4534 cat file2.txt Id number 25671 34343 76767 34234 23343 23423 66776 23343 (4 Replies)
Discussion started by: halfafringe
4 Replies

6. Shell Programming and Scripting

join two files based on one column

Hi All, I am trying to join to files based on one common column. Cat File1 ID HID Ab_1 23 Cd 45 df 22 Vv 33 Cat File2 ID pval Ab_1 0.3 Cd 10 Vv 0.0444 (3 Replies)
Discussion started by: newpro
3 Replies

7. Shell Programming and Scripting

Left Join in Unix based on Key?

So I have 2 files: File 1: 111,Mike,Stipe 222,Peter,Buck 333,Mike,Mills File 2: 222,Mr,Bono 444,Mr,Edge I want output to be below, where 222 records joined and all none joined records still in output 111,Mike,Stipe 222,Peter,Buck,Mr,Bono 333,Mike,Mills 444,Mr,Edge (4 Replies)
Discussion started by: stack
4 Replies

8. Shell Programming and Scripting

"Join" or "Merge" more than 2 files into single output based on common key (column)

Hi All, I have working (Perl) code to combine 2 input files into a single output file using the join function that works to a point, but has the following limitations: 1. I am restrained to 2 input files only. 2. Only the "matched" fields are written out to the "matched" output file and... (1 Reply)
Discussion started by: Katabatic
1 Replies

9. UNIX for Advanced & Expert Users

Relational Join (Composite Key)

hi, i have file 1: ====== 0501000|X1 0502000|X2 0501231|X3 0981222|X4 0502000|X6 0503000|X7 0932322|X8 file 2: ======= 050 0501 0502 09 098 (1 Reply)
Discussion started by: magedfawzy
1 Replies

10. Shell Programming and Scripting

Join 3 files using key column in a mapping file

I'm new of UNIX shell scripting. I'm recently generating a excel report in UNIX(file with delimiter is fine). How should I make a script to do it? 1 file to join comes from output of one UNIX command, the second from another UNIX command, and third from a database query. The key columes of all... (7 Replies)
Discussion started by: bigsmile
7 Replies
Login or Register to Ask a Question