Matching and extracting data from a file


 
# 1  
Old 10-10-2015

Gents,

Matching columns 1-19 in file1 against columns 20-38 in file2, I would like to extract the data in the same order as file2.

file1
Code:
X  7494     11511  44149.00  48617.002    1  4321  44148.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  433  8641  44160.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  865 12961  44172.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1297 17281  44184.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1729 21601  44196.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 2161 25921  44208.00  48198.00  49060.001 
X  7494     11611  44137.00  48641.001    1  4321  44148.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  433  8641  44160.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  865 12961  44172.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1297 17281  44184.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1729 21601  44196.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 2161 25921  44208.00  47994.00  48856.001 
X  7494     11711  44137.00  48629.001    1  4321  44148.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  433  8641  44160.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  865 12961  44172.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1297 17281  44184.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1729 21601  44196.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 2161 25921  44208.00  48210.00  49072.001 
X  7494     11811  44137.00  48425.001    1  4321  44148.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  433  8641  44160.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  865 12961  44172.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1297 17281  44184.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1729 21601  44196.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 2161 25921  44208.00  48006.00  48868.001

file2

Code:
44137.00  48629.001
44149.00  48617.002
44137.00  48425.001
44137.00  48641.001

Desired output

Code:
X  7494     11511  44137.00  48629.001    1  4321  44148.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001  433  8641  44160.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001  865 12961  44172.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 1297 17281  44184.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 1729 21601  44196.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 2161 25921  44208.00  48198.00  49060.001
X  7494     11611  44149.00  48617.002    1  4321  44148.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002  433  8641  44160.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002  865 12961  44172.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002 1297 17281  44184.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002 1729 21601  44196.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002 2161 25921  44208.00  47994.00  48856.001
X  7494     11711  44137.00  48425.001    1  4321  44148.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001  433  8641  44160.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001  865 12961  44172.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001 1297 17281  44184.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001 1729 21601  44196.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001 2161 25921  44208.00  48210.00  49072.001
X  7494     11811  44137.00  48641.001    1  4321  44148.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001  433  8641  44160.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001  865 12961  44172.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001 1297 17281  44184.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001 1729 21601  44196.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001 2161 25921  44208.00  48006.00  48868.001

Thanks for your help.
# 2  
Old 10-10-2015
Assuming that by "columns" you mean character positions, and that you swapped the ones for file1 and file2, try:
Code:
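awk '
# First file (file1): group lines by the key in character positions 20-38.
NR==FNR {IX=substr($0,20,19)
         T[IX] = T[IX] DL[IX] $0   # append the line; DL[IX] is empty for the first line of a group
         DL[IX] = "\n"             # later lines of the same group are joined with a newline
         next
        }
# Second file (file2): each line is a key; print its group, in file2 order.
        {print T[$0]
        }
' file1 file2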
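
A minimal self-contained sketch of the position-based lookup above, assuming the key sits in character positions 20-38 of file1 and each line of file2 is exactly that 19-character key. The sample rows are shortened stand-ins for the posted data, and the function name extract_by_position and the temporary directory are hypothetical. Unlike the script above, this variant skips keys that have no match instead of printing an empty line.

```shell
#!/bin/sh
# Hypothetical sample data: shortened stand-ins for file1 and file2.
dir=$(mktemp -d)
printf '%s\n' \
  'X  7494     11511  44149.00  48617.002    1  4321' \
  'X  7494     11711  44137.00  48629.001    1  4321' \
  'X  7494     11511  44149.00  48617.002  433  8641' > "$dir/file1"
printf '%s\n' \
  '44137.00  48629.001' \
  '44149.00  48617.002' > "$dir/file2"

# extract_by_position DATAFILE KEYFILE
# Prints the DATAFILE lines grouped by key, in the order the keys appear in KEYFILE.
extract_by_position() {
  awk '
    # First file: collect lines under the key found at positions 20-38.
    NR == FNR { key = substr($0, 20, 19)
                lines[key] = lines[key] $0 "\n"
                next }
    # Second file: print the stored group only if the key was seen.
    $0 in lines { printf "%s", lines[$0] }
  ' "$1" "$2"
}

extract_by_position "$dir/file1" "$dir/file2"
```

With this sample the 11711 line comes out first, followed by the two 11511 lines, because that is the order of the keys in file2.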

# 3  
Old 10-10-2015
Dear RudiC
I don't get the disared
Do I need to change something in file2?
# 4  
Old 10-10-2015
Quote:
Originally Posted by jiam912
Dear RudiC
I don't get the disared
Do I need to change something in file2?
I'm not sure what "disared" means.

And the output you showed in post #1 in this thread does not match your requirements where you said "I would like to extract the data in the same order of file2."

Note that the 1st line in your sample file2 is:
Code:
44137.00  48629.001

and the lines in your sample file1 that contain those values in character positions 20 through 38 are:
Code:
X  7494     11711  44137.00  48629.001    1  4321  44148.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  433  8641  44160.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  865 12961  44172.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1297 17281  44184.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1729 21601  44196.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 2161 25921  44208.00  48210.00  49072.001

These match the first six lines of the output produced by RudiC's script (including the trailing space characters that are present in file1):
Code:
X  7494     11711  44137.00  48629.001    1  4321  44148.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  433  8641  44160.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  865 12961  44172.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1297 17281  44184.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1729 21601  44196.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 2161 25921  44208.00  48210.00  49072.001 
X  7494     11511  44149.00  48617.002    1  4321  44148.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  433  8641  44160.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  865 12961  44172.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1297 17281  44184.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1729 21601  44196.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 2161 25921  44208.00  48198.00  49060.001 
X  7494     11811  44137.00  48425.001    1  4321  44148.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  433  8641  44160.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  865 12961  44172.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1297 17281  44184.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1729 21601  44196.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 2161 25921  44208.00  48006.00  48868.001
X  7494     11611  44137.00  48641.001    1  4321  44148.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  433  8641  44160.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  865 12961  44172.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1297 17281  44184.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1729 21601  44196.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 2161 25921  44208.00  47994.00  48856.001

and that the 1st six lines of output you said you desired:
Code:
X  7494     11511  44137.00  48629.001    1  4321  44148.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001  433  8641  44160.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001  865 12961  44172.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 1297 17281  44184.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 1729 21601  44196.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 2161 25921  44208.00  48198.00  49060.001

do not contain any trailing spaces, and, even if they did, these lines do not appear anywhere in file1 in your sample data.

RudiC's code followed your stated requirements but did not produce the output you said you wanted to produce. So, are your requirements wrong, or is your sample output wrong?
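
The comparison above can also be done mechanically. This is only a sketch, assuming the desired output has been saved to a file (called wanted here, a hypothetical name) and that trailing whitespace should be ignored when comparing. It lists every wanted line that does not occur anywhere in file1. The sample rows are shortened stand-ins for the posted data.

```shell
#!/bin/sh
# Hypothetical sample data; the file1 lines carry a trailing space, as in the post.
dir=$(mktemp -d)
printf '%s\n' \
  'X  7494     11511  44149.00  48617.002    1  4321 ' \
  'X  7494     11711  44137.00  48629.001    1  4321 ' > "$dir/file1"
printf '%s\n' \
  'X  7494     11511  44137.00  48629.001    1  4321' \
  'X  7494     11711  44137.00  48629.001    1  4321' > "$dir/wanted"

# lines_not_in_source SOURCE WANTED
# Prints the WANTED lines that are absent from SOURCE, ignoring trailing whitespace.
lines_not_in_source() {
  awk '
    # Normalise every line of both files by dropping trailing blanks.
    { line = $0; sub(/[ \t\r]+$/, "", line) }
    # First file: remember each normalised line.
    NR == FNR { present[line] = 1; next }
    # Second file: report lines that were never seen in the first file.
    !(line in present) { print line }
  ' "$1" "$2"
}

lines_not_in_source "$dir/file1" "$dir/wanted"
```

Here only the 11511 line with the 44137.00  48629.001 key is reported, since no such combination exists in the sample file1.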
# 5  
Old 10-11-2015
Do you want to extract lines from file1 (as specified) OR do you want to modify them?
# 6  
Old 10-11-2015
Dear RudiC and Don

I want to extract the data from file1, sorted in the order of the reference file2. Please, I need the output like this:

Code:
X  7494     11711  44137.00  48629.001    1  4321  44148.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001  433  8641  44160.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001  865 12961  44172.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001 1297 17281  44184.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001 1729 21601  44196.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001 2161 25921  44208.00  48210.00  49072.001
X  7494     11511  44149.00  48617.002    1  4321  44148.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002  433  8641  44160.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002  865 12961  44172.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002 1297 17281  44184.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002 1729 21601  44196.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002 2161 25921  44208.00  48198.00  49060.001
X  7494     11811  44137.00  48425.001    1  4321  44148.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001  433  8641  44160.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001  865 12961  44172.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001 1297 17281  44184.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001 1729 21601  44196.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001 2161 25921  44208.00  48006.00  48868.001
X  7494     11611  44137.00  48641.001    1  4321  44148.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001  433  8641  44160.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001  865 12961  44172.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001 1297 17281  44184.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001 1729 21601  44196.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001 2161 25921  44208.00  47994.00  48856.001

The previous output was wrong.

Sorry for the inconvenience and thanks for your help
# 7  
Old 10-11-2015
Hi, try:
Code:
awk 'NR==FNR{A[$4,$5]=A[$4,$5] $0 ORS; next} {printf "%s",A[$1,$2]}' file1 file2
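
For readers comparing the two approaches: the one-liner above keys on whitespace-separated fields 4 and 5 of file1 (and fields 1 and 2 of file2) rather than on character positions, so it should tolerate different spacing in file2. A sketch with shortened stand-in data follows. The function name extract_by_fields is hypothetical, and the assumption is that the key fields are always separated by whitespace from their neighbours.

```shell
#!/bin/sh
# Hypothetical sample data; note the single space in the first key of file2.
dir=$(mktemp -d)
printf '%s\n' \
  'X  7494     11511  44149.00  48617.002    1  4321' \
  'X  7494     11711  44137.00  48629.001    1  4321' \
  'X  7494     11511  44149.00  48617.002  433  8641' > "$dir/file1"
printf '%s\n' \
  '44137.00 48629.001' \
  '44149.00  48617.002' > "$dir/file2"

# extract_by_fields DATAFILE KEYFILE
# Groups DATAFILE lines by fields 4 and 5, then prints the groups in KEYFILE order.
extract_by_fields() {
  awk '
    # First file: append each line (plus ORS) to the group for its two key fields.
    NR == FNR { group[$4, $5] = group[$4, $5] $0 ORS; next }
    # Second file: print the group for this key; an unknown key prints nothing.
    { printf "%s", group[$1, $2] }
  ' "$1" "$2"
}

extract_by_fields "$dir/file1" "$dir/file2"
```

The trade-off is that a field-based key breaks if two adjacent fixed-width values ever run together without a space, whereas the substr() approach depends on exact positions and exact spacing in file2.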
