Compare two files and extract


 
# 1  
Old 09-11-2014

Assume we have two files, FileA and FileB, with the contents shown below:

FileA:
Code:
1001,value1,value4,value8,value9
1002,value4,value32,value46,value33
1503,value5,value45,value68,value53
1605,value4,value67,value56,value57
1073,value5,value45,value68,value53
1005,value4,value67,value56,value57
1006,value2,value75,value35,value79

FileB:
Code:
1001,value741,value474,value2568,value479
1002,value36,value3452,value3736,value145
1503,value47,value55,value68,value53
1605,value86,value87,value86,value5367
1073,value85,value85,value98,value333

Using the first column of FileB as the key, compare it with FileA, extract the matching lines from FileA, and store them in a new file. The expected output is:

Code:
1001,value1,value4,value8,value9
1002,value4,value32,value46,value33
1503,value5,value45,value68,value53
1605,value4,value67,value56,value57
1073,value5,value45,value68,value53

FileA will have more than 1 million lines (all unique) and FileB will have 50k lines (all unique), so the lines must be extracted from FileA based on the first column of FileB. Please advise on an awk solution.

Moderator's Comments:
Please use CODE tags for code, files, input & output/errors
It makes it much easier to read as fixed width text and preserves multiple spaces

Last edited by rbatte1; 09-11-2014 at 10:46 AM. Reason: Added CODE tags and made file names consistent within the post.
# 2  
Old 09-11-2014
Please use code tags as required by forum rules!

Try:
Code:
sed 's/^/^/;s/,.*$/,/' FileB | grep -f- FileA
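
To see why this works: the sed command reduces each line of the key file to an anchored pattern such as ^1001, and grep -f then reads those patterns and keeps the lines of the data file that match any of them. Below is a minimal sketch on a subset of the sample data from post #1, split into two steps so the intermediate patterns can be inspected. The scratch directory and the names patterns.txt and out_grep.txt are assumptions for illustration only.

```shell
# Work in a scratch directory (an assumption for this sketch).
cd "$(mktemp -d)" || exit 1

# A small subset of the sample data from post #1.
printf '%s\n' \
  '1001,value1,value4,value8,value9' \
  '1002,value4,value32,value46,value33' \
  '1005,value4,value67,value56,value57' \
  '1006,value2,value75,value35,value79' > FileA
printf '%s\n' \
  '1001,value741,value474,value2568,value479' \
  '1002,value36,value3452,value3736,value145' > FileB

# Step 1: reduce every FileB line to an anchored key pattern, e.g. "^1001,".
sed 's/^/^/;s/,.*$/,/' FileB > patterns.txt

# Step 2: keep the FileA lines that match any of the patterns.
grep -f patterns.txt FileA > out_grep.txt
cat out_grep.txt
```

Note that the keys are treated as regular expressions, so a key containing characters such as . or * would need escaping before being used this way.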

# 3  
Old 09-11-2014
Hello alnhk,

The following may help.

Code:
awk -F"," 'NR==FNR{a[$1]=$0;next} ($1 in a){print $0}' FileB FileA

Output will be as follows.

Code:
1001,value1,value4,value8,value9
1002,value4,value32,value46,value33
1503,value5,value45,value68,value53
1605,value4,value67,value56,value57
1073,value5,value45,value68,value53
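
Since post #1 asks for the result to be stored in a new file, here is a commented sketch of the same two-file awk approach with the output redirected. The output name extracted.txt, the array name keys, and the scratch directory are assumptions for illustration. The array holds only the keys, since the FileB values are never printed.

```shell
# Work in a scratch directory (an assumption for this sketch).
cd "$(mktemp -d)" || exit 1

# A small subset of the sample data from post #1.
printf '%s\n' \
  '1001,value1,value4,value8,value9' \
  '1002,value4,value32,value46,value33' \
  '1005,value4,value67,value56,value57' \
  '1006,value2,value75,value35,value79' > FileA
printf '%s\n' \
  '1001,value741,value474,value2568,value479' \
  '1002,value36,value3452,value3736,value145' > FileB

awk -F',' '
  NR == FNR { keys[$1]; next }  # first file (FileB): remember each key
  $1 in keys                    # second file (FileA): print lines whose key was seen
' FileB FileA > extracted.txt

cat extracted.txt
```

NR == FNR is true only while awk is reading the first file named on the command line, so FileB must come first. Lines are printed in FileA order. One caveat: if FileB is empty, the NR == FNR test also holds for FileA, and nothing is printed.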

Thanks,
R. Singh
# 4  
Old 09-11-2014
Hello alnhk,

I have a few questions to pose in response first:
  • Is this homework/assignment? There are specific forums for these.
  • What have you tried so far?
  • What output/errors do you get?
  • What OS and version are you using?
  • What are your preferred tools? (C, shell, perl, awk, etc.)
  • What logical process have you considered? (to help steer us to follow what you are trying to achieve)
Most importantly, what have you tried so far?

There are usually many ways to achieve a task, so giving us an idea of your style and thinking will help us guide you to the answer that suits you best, one you can adapt to your needs in future.

We're all here to learn, and getting the relevant information will help us all.


Thanks in advance,
Robin