getting data from one file based on another


 
# 1  
Old 01-24-2011

Hi there,

I've got two files:

file1
Code:
1 1234 1240 ABC
6 3456 10299 DCV
10 39480 50690 IGN

file2
Code:
1 1235 klp
8 2568 ghl
10 39500 ghk

I need to add column 4 of file1 as an additional column to file2 when column 1 of file2 matches column 1 of file1 and the number in column 2 of file2 falls within the range given by columns 2 and 3 of file1. For example, take the first line of both files: the first columns match (both are 1), and column 2 of file2 (1235) is within the range of columns 2 and 3 of file1 (1235 is between 1234 and 1240), so column 4 of file1 needs to be added to line 1 of file2. The result (for that one line) would be:

file3
1 1235 klp ABC

Similarly, line 2 of file2 has no match in file1 and gets a "-" placeholder, and line 3 of file2 gets IGN from file1.

The final file3 should look like this:

Code:
1 1235 klp ABC
8 2568 ghl -
10 39500 ghk IGN

Many thanks for your help!
# 2  
Old 01-24-2011
Loose thinking: you are combining two files and making a third, which is what "join" does.
# 3  
Old 01-24-2011
I think join will put two files together according to the identical fields, but I need also to kind of match the second field of file2 to fields 2 and 3 of file3. Once the 1st fields are matching, then the value of second column of file2 needs to be in the range of columns 2 and 3 of file1, and only then the 4th column of file1 should be added to file2. I am sorry for the confusion.

---------- Post updated at 01:04 PM ---------- Previous update was at 01:03 PM ----------

sorry, I mistyped: I need to kind of match the second column of file2 to the 2nd and 3rd columns of file1 (not 3 as I wrote before).
# 4  
Old 01-24-2011
Ah, the ranging bit. It is not heavy duty, but in shell you can read each line into fields (read f1 f2 f3 f4) and compare them, using (( $a2 >= $b2 && $a2 <= $b3 )) for numeric comparison. You can use one big while loop that reads new a fields when a is behind and new b fields when b is behind. Prime the key variables before the loop with low values such as -9999, -9998, -9997, so that the first pass reads the first line of each file: file a because the old a value is too low, and file b because the old b value is too low as well. Then print the join when the values are in range. Make use of the fact that the relationship is 0-or-1 on one side to 1-or-many on the other. If it is many-to-many, you have to scan one file for every line in the other.
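
A minimal sketch of that loop, in case it helps. The function name range_merge is made up for illustration. It assumes both files are sorted ascending by column 1 (and by column 2 within a key), and that the ranges in the range file do not overlap. Instead of priming the variables with low sentinel values, it reads the first range line before the loop, which has the same effect. As in the description above, the a fields come from the file being annotated and the b fields from the range file. It uses POSIX test operators rather than (( )) so it runs in plain sh.

```shell
# range_merge RANGEFILE DATAFILE  (hypothetical helper, not from the thread)
# Assumes: both files sorted numerically by column 1, ranges do not overlap.
range_merge() {
    {
        # Read the first range line; blank the key if the range file is empty.
        read -r b1 b2 b3 b4 <&3 || b1=
        while read -r a1 a2 a3 <&4; do
            # Advance the range file while its current range ends before
            # the current data line (smaller key, or same key and lower end).
            while [ -n "$b1" ] && { [ "$b1" -lt "$a1" ] ||
                  { [ "$b1" -eq "$a1" ] && [ "$b3" -lt "$a2" ]; }; }; do
                read -r b1 b2 b3 b4 <&3 || b1=
            done
            # Print the join if key matches and the value is in range, else "-".
            if [ -n "$b1" ] && [ "$b1" -eq "$a1" ] &&
               [ "$a2" -ge "$b2" ] && [ "$a2" -le "$b3" ]; then
                echo "$a1 $a2 $a3 $b4"
            else
                echo "$a1 $a2 $a3 -"
            fi
        done
    } 3<"$1" 4<"$2"
}
```

Usage would be something like range_merge file1 file2 > file3. Each file is read only once, but a shell read loop is slow, so for a file of a couple of gigabytes awk is likely the better tool.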
# 5  
Old 01-24-2011
Many thanks for your suggestion. I am afraid my unix knowledge is not that great to completely understand what you meant. Also, the file is pretty big (~2G in size), so I assume I will scan each line in turn. If it is not too much to ask, could you please write an example script? Many thanks for your time. It is really appreciated.
# 6  
Old 01-25-2011
Try this:
Code:
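awk 'NR==FNR {a[$1]=$2; b[$1]=$3; c[$1]=$4; next}   # file1: store range start, end and label by key
# file2: append the label when the key exists and $2 lies within [start, end], otherwise "-"
{print $0, (($1 in a) && $2 >= a[$1] && $2 <= b[$1] ? c[$1] : "-")}' file1 file2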

# 7  
Old 01-25-2011
Nice, but needs cooperative data:
  • Are the values in one file unique and not overlapping in range, as implied by the examples?
  • Are they both sorted by the keys ascending?
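
If the answer to the first question is no, in the sense that one key can carry several ranges in the range file, a variant along these lines might do. This is only a sketch under assumptions: the name range_join is invented here, the range file (given first) is assumed to fit in memory, and if ranges for one key overlap the first matching one wins. Neither file needs to be sorted, and the second file is streamed line by line.

```shell
# range_join RANGEFILE DATAFILE  (hypothetical helper, not from the thread)
range_join() {
    awk '
    NR == FNR {                       # first file: keep every range per key
        n = ++cnt[$1]
        lo[$1, n] = $2 + 0            # + 0 forces numeric comparison later
        hi[$1, n] = $3 + 0
        tag[$1, n] = $4
        next
    }
    {                                 # second file: check only ranges with this key
        label = "-"
        for (i = 1; i <= cnt[$1]; i++)
            if ($2 + 0 >= lo[$1, i] && $2 + 0 <= hi[$1, i]) {
                label = tag[$1, i]    # first matching range wins
                break
            }
        print $0, label
    }' "$1" "$2"
}
```

It would be called as range_join file1 file2 > file3. With one range per key it should behave like the one-liner above, at the cost of a short inner loop per line.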
 