Create file based on data from two other files

 
# 1  
Old 03-03-2017

I have looked through several threads about merging files with awk and have also tried join, but without success, most likely because I do not fully understand awk.
What I am attempting is to take a csv file of between 1 and 15,000 lines with 5 columns, and another csv file of between 1 and 200 lines with 2 columns, and create a new file based on a matching field. The match pair is $5 in csv1 and $1 in csv2.

csv1
Code:
SABBKRDHS	   00:02:DE:55:DB:74	SA3000	TS	TR011 
SABBNLRJL	   00:02:DE:62:2C:A2	SA3000	TS	TR002 
SABBPKRLF	   00:02:DE:66:D0:18	SA3000	TS	TR003 
SABBQFHXM	   00:02:DE:6A:A0:80	SA3000	TS	TR003 
SABBPKWXD	   00:02:DE:66:DE:26	SA3000	1	TR006B
SABBNLTLX	   00:02:DE:62:33:46	SA3000	TS	TR009 
SABBPLKBH	   00:02:DE:66:FA:9C	SA3000	TS	TR010 
SABBQLJVG	   00:02:DE:6B:DB:E0	SA3000	TS	TR010 
SABBNSGSR	   00:02:DE:63:88:AC	SA3000	TS	TR011 
SABBQLKKH	   00:02:DE:6B:DD:9A	SA3000	1	TR013 
SABBNLNJX	   00:02:DE:62:23:56	SA3000	TS	TR015 
SABBPNPNF	   00:02:DE:67:85:A8	SA3000	TS	TR023 
SABBNLZTF	   00:02:DE:62:40:C0	SA3000	TS	TR026

csv2
Code:
TR002 	qpska-1
TR003 	qpska-2
TR006B	qpska-3
TR009 	qpska-4
TR010 	qpska-5
TR011 	qpska-6
TR013 	qpska-7
TR015 	qpska-8
TR023 	qpska-9
TR026 	qpska-10
TR101 	qpska-11
TR102 	qpska-12
TR103 	qpska-13
TR104 	qpska-14


With the desired output to be csv3

Code:
   00:02:DE:62:2C:A2	qpska-1
   00:02:DE:66:D0:18	qpska-2
   00:02:DE:6A:A0:80	qpska-2
   00:02:DE:66:DE:26	qpska-3
   00:02:DE:62:33:46	qpska-4
   00:02:DE:66:FA:9C	qpska-5
   00:02:DE:6B:DB:E0	qpska-5
   00:02:DE:63:88:AC	qpska-6
   00:02:DE:6B:DD:9A	qpska-7
   00:02:DE:62:23:56	qpska-8
   00:02:DE:67:85:A8	qpska-9
   00:02:DE:62:40:C0	qpska-10
   00:02:DE:6C:16:C6	qpska-11
   00:02:DE:67:1F:FA	qpska-12
   00:02:DE:6B:E7:D8	qpska-12
   00:02:DE:61:C7:8E	qpska-13
   00:02:DE:6A:7F:26	qpska-13
   00:02:DE:62:17:CA	qpska-14

I would really like to understand how to format the awk command.

Thanks,
# 2  
Old 03-03-2017
Unfortunately, you didn't specify what to do with lines that don't find a match: suppress them? Print an error message or a default?
Anyhow, try
Code:
awk 'NR==FNR {T[$5] = $2; next} {print T[$1], $2}' OFS="\t" csv[12]
00:02:DE:62:2C:A2    qpska-1
00:02:DE:6A:A0:80    qpska-2
00:02:DE:66:DE:26    qpska-3
00:02:DE:62:33:46    qpska-4
00:02:DE:6B:DB:E0    qpska-5
00:02:DE:63:88:AC    qpska-6
00:02:DE:6B:DD:9A    qpska-7
00:02:DE:62:23:56    qpska-8
00:02:DE:67:85:A8    qpska-9
00:02:DE:62:40:C0    qpska-10
    qpska-11
    qpska-12
    qpska-13
    qpska-14
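
If the unmatched lines should simply be suppressed, one possible variation is to guard the print with an "in" test, so that csv2 keys that never appeared in column 5 of csv1 produce no output. This is only a sketch: the scratch directory and the trimmed sample rows are assumptions for illustration, not the real files.

```shell
# Hypothetical scratch data: a few rows cut down from the samples in post #1.
tmp=$(mktemp -d)
printf 'SABBNLRJL\t00:02:DE:62:2C:A2\tSA3000\tTS\tTR002\n' >  "$tmp/csv1"
printf 'SABBPKRLF\t00:02:DE:66:D0:18\tSA3000\tTS\tTR003\n' >> "$tmp/csv1"
printf 'TR002\tqpska-1\nTR003\tqpska-2\nTR101\tqpska-11\n' >  "$tmp/csv2"

# First file (csv1): remember field 2 under the key in field 5.
# Second file (csv2): print only when field 1 is a key that was stored,
# so the TR101 line is skipped instead of printing an empty first column.
matched=$(awk 'NR==FNR {T[$5] = $2; next} $1 in T {print T[$1], $2}' OFS="\t" "$tmp/csv1" "$tmp/csv2")
printf '%s\n' "$matched"

rm -rf "$tmp"
```

Unlike T[$1], the test "$1 in T" does not create an empty array element as a side effect.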

# 3  
Old 03-04-2017
Given that values in the last column of csv1 occur more than once, I think it makes more sense to do the join the other way around (in the command in post #2, the array keeps only the last csv1 line for each value in column 5).

Code:
awk 'NR==FNR {A[$1]=$2; next} {print $2, A[$5]}' OFS="\t" csv2 csv1

output:
Code:
00:02:DE:55:DB:74	qpska-6
00:02:DE:62:2C:A2	qpska-1
00:02:DE:66:D0:18	qpska-2
00:02:DE:6A:A0:80	qpska-2
00:02:DE:66:DE:26	qpska-3
00:02:DE:62:33:46	qpska-4
00:02:DE:66:FA:9C	qpska-5
00:02:DE:6B:DB:E0	qpska-5
00:02:DE:63:88:AC	qpska-6
00:02:DE:6B:DD:9A	qpska-7
00:02:DE:62:23:56	qpska-8
00:02:DE:67:85:A8	qpska-9
00:02:DE:62:40:C0	qpska-10

However, I cannot figure out why:
Code:
00:02:DE:55:DB:74	qpska-6

is not present in the output sample in post #1
# 4  
Old 03-06-2017
Thank you both for the replies, and bear with me; I want to make sure I understand so I don't have to ask again.

{A[$1]=$2; next}

Is this putting column 1 of file 2 into an array and making it variable $2?
next stops parsing file 2.
{print $2, A[$5]}
Then it prints variable $2 and the value of column 5 (A[$5]) when it has a match from array $1 of file 1 to a value in file 2?
# 5  
Old 03-06-2017
Not quite.

A[$1]=$2: Create an element for array A indexed by $1 (the first field in the line) and assign $2 (the second field)'s contents.

next: Stop processing THIS line of file2; read in and process the next line, until done with file2.

print $2, A[$5]: When processing the second file on the command line (file1), print its second field, then OFS, then the element of array A indexed by the fifth field, which is an empty string if no such element exists.
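
The same explanation can be sketched as a commented, multi-line version of the command from post #3. The temporary directory, the shortened sample rows, and the made-up TR999 row (a key with no entry in csv2) are assumptions added only to make the example self-contained.

```shell
tmp=$(mktemp -d)
# csv2: key in field 1, value in field 2.
printf 'TR002\tqpska-1\nTR003\tqpska-2\n' > "$tmp/csv2"
# csv1: key in field 5. TR003 occurs twice; the TR999 row is hypothetical.
printf 'SABBNLRJL\t00:02:DE:62:2C:A2\tSA3000\tTS\tTR002\n' >  "$tmp/csv1"
printf 'SABBPKRLF\t00:02:DE:66:D0:18\tSA3000\tTS\tTR003\n' >> "$tmp/csv1"
printf 'SABBQFHXM\t00:02:DE:6A:A0:80\tSA3000\tTS\tTR003\n' >> "$tmp/csv1"
printf 'EXAMPLE\t00:00:00:00:00:00\tSA3000\tTS\tTR999\n'   >> "$tmp/csv1"

result=$(awk '
    # NR counts lines across all files, FNR restarts for each file,
    # so NR == FNR holds only while the first file (csv2) is being read.
    NR == FNR {
        A[$1] = $2   # store field 2 under the key found in field 1
        next         # skip the rule below and move on to the next line
    }
    # Reached only for lines of the second file (csv1).
    {
        print $2, A[$5]   # A[$5] is an empty string if csv2 had no such key
    }
' OFS="\t" "$tmp/csv2" "$tmp/csv1")
printf '%s\n' "$result"

rm -rf "$tmp"
```

Both TR003 rows are printed with qpska-2, because the lookup table is built from csv2 and consulted once per csv1 line; the TR999 row is printed with an empty second column.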