Visit The New, Modern Unix Linux Community


Reading 2 CSV files and filtering data based on group


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Reading 2 CSV files and filtering data based on group
# 1  
Reading 2 CSV files and filtering data based on group

I have two CSV files in the following format:
First file:
Code:
GroupID, PID:TID, IP, Port

Sample data:
Code:
0,1000:11,127.0.0.1,445
0,-1:-1,127.0.0.1,800
1,1000:11,127.0.0.1,445
1,-1:-1,127.0.0.1,900
2,1000:11,127.0.0.1,445
2,-1:-1,180.0.0.3,900

Second file:
Code:
IP,Port,PID

Sample data
Code:
127.0.0.1,445,1000
127.0.0.1,800,1000
127.0.0.1,900,2000

What I need to do is as follows:
1. in the first file, whereever I have PID as -1, I need to pick its IP and port and llok it up in 2nd file and replace the -1 PID with the PID from 2nd file
2. Now based on the groupid (we are sure that there would be exacty 2 records with same group id), ignore all those groups where we have the same PID or if any PID is still -1 (ip port not found in 2nd file)

So in the above sample data:
first step will lead us to
Code:
0,1000:11,127.0.0.1,445
0,1000:-1,127.0.0.1,800
1,1000:11,127.0.0.1,445
1,2000:-1,127.0.0.1,900
2,1000:11,127.0.0.1,445
2,-1:-1,180.0.0.3,900

and the second step should get us the desired output
Code:
1,1000:11,127.0.0.1,445
1,2000:-1,127.0.0.1,900

We do not need the file after 1st step but that is just for explanation.

Moderator's Comments:
Mod Comment How to use code tags

Last edited by Franklin52; 10-03-2012 at 10:07 AM.. Reason: Please use code tags for data and code samples
# 2  
There might be better ways. But, try:
Code:
awk -F, 'FNR==NR{a[$1,$2]=$3;next}
{split($2,b,":")
if(b[1]==-1 && ($3,$4) in a) b[1]=a[$3,$4]
$2=b[1] ":" b[2]
temp = (length(temp))?(temp RS $0):($0)
pid[FNR]=b[1]
}!(FNR%2){
if(pid[FNR]!=-1 && pid[FNR-1]!=-1 && pid[FNR]!=pid[FNR-1]) print temp;temp=""}' file2 OFS=, file1

# 3  
Code:
awk -F, 'FNR==NR{a[$2]=$3;next}
 {print a[$4]}' OFS=, file2.csv file1.csv

Thanks for your answer. I was trying this but even this is not working and it prints 6 empty rows. Can you please help?
# 4  
Quote:
Originally Posted by rakesh_arxmind
Code:
awk -F, 'FNR==NR{a[$2]=$3;next}
 {print a[$4]}' OFS=, file2.csv file1.csv

Thanks for your answer. I was trying this but even this is not working and it prints 6 empty rows. Can you please help?
I did not provide this solution!!!
# 5  
Yes I know. I tried the solution you gave but it didn't work. So I was trying myself in small steps and that too didn't work. Can you please check? Thanks in advance.
# 6  
The solution that I've provided works perfectly fine with your sample inputs.
Tested with the awk on AIX 6.1 and gawk on cygwin.
Which awk version are you using?
Is there any white-space in any of the fields of file1 and file2?
Would you post the output of head -2 <file>|od -bc of your 2 files?
# 7  
I found the issue. the row separator was somehow \r\n which was causing the issue. The moment we used
Code:
 RS="\r\n"

, it worked for us.

Thanks a lot for your help.

Previous Thread | Next Thread
Thread Tools Search this Thread
Search this Thread:
Advanced Search

Test Your Knowledge in Computers #490
Difficulty: Medium
Java source code is compiled into Java bytecode which is actual machine code.
True or False?

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Filtering records of a csv file based on a value of a column

Hi, I tried filtering the records in a csv file using "awk" command listed below. awk -F"~" '$4 ~ /Active/{print }' inputfile > outputfile The output always has all the entries. The same command worked for different users from one of the forum links. content of file I was... (3 Replies)
Discussion started by: sunilmudikonda
3 Replies

2. Shell Programming and Scripting

Filtering data from text to csv

Hello, Is there a way to filerter data from a text file as shown below to a Column e.g. hostname nfsmount as two separate column. Currently I could get hostname and the mount is appearing below.. using this script #! /bin/bash for i in `cat fqdn.txt` do echo "$i ............ " >>... (3 Replies)
Discussion started by: Cy Pqa
3 Replies

3. Shell Programming and Scripting

Script for extracting data from csv file based on column values.

Hi all, I am new to shell script.I need your help to write a shell script. I need to write a shell script to extract data from a .csv file where columns are ',' separated. The file has 5 columns having values say column 1,column 2.....column 5 as below along with their valuesm.... (3 Replies)
Discussion started by: Vivekit82
3 Replies

4. UNIX for Dummies Questions & Answers

Shell script to extract data from csv file based on certain conditions

Hi Guys, I am new to shell script.I need your help to write a shell script. I need to write a shell script to extract data from a .csv file where columns are ',' separated. The file has 5 columns having values say column 1,column 2.....column 5 as below along with their valuesm.... (1 Reply)
Discussion started by: Vivekit82
1 Replies

5. UNIX for Dummies Questions & Answers

using sed delete a line from csv file based on specific data in two separate fields

Hello, :wall: I have a 12 column csv file. I wish to delete the entire line if column 7 = hello and column 12 = goodbye. I have tried everything that I can find in all of my ref books. I know this does not work /^*,*,*,*,*,*,"hello",*,*,*,*,"goodbye"/d Any ideas? Thanks Please... (2 Replies)
Discussion started by: Chris Eagleson
2 Replies

6. Shell Programming and Scripting

Matching and Merging csv data fields based on a common field

Dear List, I have a file of csv data which has a different line per compliance check per host. I do not want any omissions from this csv data file which looks like this: date,hostname,status,color,check 02-03-2012,COMP1,FAIL,Yellow,auth_pass_change... (3 Replies)
Discussion started by: landossa
3 Replies

7. Shell Programming and Scripting

extract data in a csv file based on a certain field.

I have a csv file that I need to extract some data from depending on another field after reading info from another text file. The text file would say have 592560 in it. The csv file may have some data like so Field 1 Field2 Field3 Field4 Field5 Field6 20009756 1 ... (9 Replies)
Discussion started by: GroveTuckey
9 Replies

8. Shell Programming and Scripting

Reading the data from CSV and performing search through shell script

Hello, I am working on building a script that does the below actions together in my Linux server. 1) First, have to read the list of strings mentioned in CSV and store it in the shell script 2) Second, pick one by one from the string list, and search a particular folder for files that... (2 Replies)
Discussion started by: vikrams
2 Replies

9. Shell Programming and Scripting

Help with pulling / filtering data from a .csv

Good day Gurus, I have a csv file that contains an inventory of active servers. This csv file contains a well over a hundred systems (IBM, SUN, HP). It also contains those systems details. See below for an example hostA,invver,1.02,20100430 hostA,date,08/30/2010,06:18 hostA,use,"Unknown... (4 Replies)
Discussion started by: LuffyDMonkey
4 Replies

10. UNIX for Dummies Questions & Answers

How to listout the files based on group by the date...?

Hi, How to listout the files based on group by of date...? suppose if have list of 10 files which was created by on today(Apr 18) and yesterday(Apr 17 ) in my personal directory like this if i issue ls -rtl | grep "Apr 18" list out the file which was created/updated only on Apr 18th. ... (3 Replies)
Discussion started by: psiva_arul
3 Replies

Featured Tech Videos