Look up data in another file and return another field


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Look up data in another file and return another field
# 1  
Old 03-26-2012
Look up data in another file and return another field

I'm a newbie to perl and not much of a coder in general, but I must use it at my new task at work. I have two text files that are several hundred MBs large. I need to pull in data from a column in file 2 into file 1. Basically, file 2 is the lookup table and I can key off the unique id column that they both share. I'm not sure how to go about this though.

I figured I should do the following steps:
-add the new field in file 1 for the returned value to be sent
-delimit the files with tab and store in arrays
-loop through each record in the unique id field of file1 and then loop through the same field in file 2, returning the return_data field if there is a match

I wrote something similar in vba, but at this file size, that is not an option. A sample of the data is below. Any help is greatly appreciated.

File1:
col1 col2 col3 col4 uniq_id
blah1 blah2 blah3 blah4 id1
blah1 blah2 blah3 blah4 id2
blah1 blah2 blah3 blah4 id3

File2:
col1 col2 rtrn_data col4 uniq_id
blah1 blah2 rd77 blah4 id1
blah1 blah2 rd5 blah4 id2
blah1 blah2 rd6 blah4 id3

output:
col1 col2 col3 col4 uniq_id rtrn_data
blah1 blah2 blah3 blah4 id1 rd77
blah1 blah2 blah3 blah4 id2 rd5
blah1 blah2 blah3 blah4 id3 rd6
# 2  
Old 03-26-2012
AWK will be a lot easier than perl:

Code:
$ cat a.sh
#! /bin/sh


awk '
NR==FNR && NR>1 {
        map[$NF]=$3
        next
}
NR!=FNR && FNR==1 {
        print $0, "rtrn-data"
}
NR!=FNR && FNR>1 {
        print $0, map[$NF]
}' file2 file1

$ ./a.sh file2 file1
col1 col2 col3 col4 uniq_id rtrn-data
blah1 blah2 blah3 blah4 id1 rd77
blah1 blah2 blah3 blah4 id2 rd5
blah1 blah2 blah3 blah4 id3 rd6

If you really want perl code, run 'a2p' (Awk to Perl translator) to convert that awk code to perl.
# 3  
Old 03-27-2012
Code:
#!/usr/bin/perl -w

# Usage: ./this_script FILE2 FILE1

use strict;

my %id;

open my $f, '<', shift or die "$!";
<$f>; # skip the header
while (<$f>) {
    chomp;
    my @v = split;
    $id{$v[4]} = $v[2];
}
close $f;

print "col1 col2 col3 col4 uniq_id rtrn_data\n";

open $f, '<', shift or die "$!";
<$f>;
while (<$f>) {
    chomp;
    my $uniq_id = (split)[-1];
    print "$_ $id{$uniq_id}\n";
}
close $f;

# 4  
Old 03-27-2012
Alternatively, if the files are sorted on the id field, you could try this:
Code:
join -1 5 -2 5 -o 1.1,1.2,1.3,1.4,0,2.3 File1 File2

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Add a carriage return after every sorted field

Hi All, I have a text file - nmn-smt-1039.test.com:SearchService-WW:x:8277 nmn-smt-1102.test.com:AdminConsole-ww:x:8536 nmn-smt-1041.test.com:SearchService-WW:x:8277 nmn-wsf-1007.test.com:Service-ww:x:8532 nmn-smt-1042.test.com:SearchService-WW:x:8277... (3 Replies)
Discussion started by: jacki
3 Replies

2. Shell Programming and Scripting

awk command to return only field with a number in it

What is an awk command to print only fields with a number in it?? Input file....... S,S,S,S,S,S,S,S,S 001S,S,S,S,S,S,S,S,S 00219S,23S,24S,43S,47S,S,S,S,S 00319S,10S,23S,41S,43S,47S,S,S,S 00423S,41S,43S,46S,47S,S,S,S,S 00510S,23S,24S,43S,46S,S,S,S,S 00610S,23S,43S,46S,47S,S,S,S,S... (2 Replies)
Discussion started by: garethsays
2 Replies

3. Shell Programming and Scripting

Replace field in one file with whole record data of another

Hello Group, I need to replace the city field in “File 1 (fileld 3), with the entire record line of “File 2” (including delimiters) where the “city” field (File 1, Field 3)matches city field (File 2, Field1). All of the other data in “File 1” should remain intact(Fields 1,2,4,5,6). Only field... (1 Reply)
Discussion started by: vestport
1 Replies

4. Shell Programming and Scripting

Awk to remove carriage return from 65th field

Hi, I have a pipe delimited file. There are around 700 columns in the file. The 65th column has carriage return which is causing read issue with our ETL process. I would like to replace the new line characters in 65th field with "nothing" i have return the following code and need help to... (7 Replies)
Discussion started by: pinnacle
7 Replies

5. Shell Programming and Scripting

Text file to CSV with field data separated by blank lines

Hello, I have some data in a text file where fields are separated by blank lines. There are only 6 fields however some fields have several lines of data as I will explain. Also data in a particular field is not consistently the same size but does end on a blank line. The first field start with... (6 Replies)
Discussion started by: vestport
6 Replies

6. UNIX for Dummies Questions & Answers

Sorting data in file based on field in another file

Hi, I have two files, one of which I would like to sort based on the order of the data in the second. I would like to do this using a simple unix statement. My two files as follows: File 1: 12345 1 2 2 2 0 0 12349 0 0 2 2 1 2 12350 1 2 1 2 2 2 . . . File2: 12350... (3 Replies)
Discussion started by: kasan0
3 Replies

7. Shell Programming and Scripting

extract data in a csv file based on a certain field.

I have a csv file that I need to extract some data from depending on another field after reading info from another text file. The text file would say have 592560 in it. The csv file may have some data like so Field 1 Field2 Field3 Field4 Field5 Field6 20009756 1 ... (9 Replies)
Discussion started by: GroveTuckey
9 Replies

8. Shell Programming and Scripting

Extract data into file with specific field specs

:confused: I have a tab delimited file that I need to extract data from and into a file with specific field specs. Each field has to be a certain amount of characters. So, the name field (from delimited file) might have only 15 characters but needs to be 25 (in new file) so I need to insert spaces... (5 Replies)
Discussion started by: criddel
5 Replies

9. Shell Programming and Scripting

Find lines in text file with certain data in first field

Hi all, Sorry for the title, I was unsure how to word my issue. I'll get right to the issue. In my text file, I need to find all lines with the same data in the first field. Then I need to create a file with the matching lines merged into one. So my original file will look something like... (4 Replies)
Discussion started by: rstev39147
4 Replies

10. Shell Programming and Scripting

AWK - printing certain fields when field order changes in data file

I'm hoping someone can help me on this. I have a data file that greatly simplified might look like this: sec;src;dst;proto 421;10.10.10.1;10.10.10.2;tcp 426;10.10.10.3;10.10.10.4;udp 442;10.10.10.5;10.10.10.6;tcp sec;src;fac;dst;proto 521;10.10.10.1;ab;10.10.10.2;tcp... (3 Replies)
Discussion started by: eric4
3 Replies
Login or Register to Ask a Question