select records from one file based on a second file


 
# 1  
Old 07-17-2009

Hi all:

I have two files:

file1:
Code:
74 DS  9871 199009871    1 1990  4  1 165200 Sc
pr de te sa ox
      1.0      1.0  13.0000  35.7560    5.950
      3.0      3.0  13.0100  35.7550    5.970 
**
74 DS 99004 74DS99004 6738 1990  4  1 165200 Eb
pr de te sa ox
      1.0      1.0  13.0000  35.7560    5.950
      3.0      3.0  13.0100  35.7550    5.970 
**
29 CS   563 198700563  280 1987  7 27 043900 Ic
pr de te sa ox
      9.0      8.9  12.2700  35.6500    6.060
     11.0     10.9  12.2100  35.6490    6.100
**
74 DS  9871 199009871    8 1990  4  3 161500 Sc
pr de te sa ox
      1.0      1.0  13.4500  35.7420    5.940
      3.0      3.0  13.4500  35.7420    5.950
**
74 DS 99004 74DS99004 6911 1990  4  6 042500 Eb
pr de te sa
      2.0      2.0  15.3300  35.5920
      3.0      3.0  15.3300  35.5920
**

and file2:
Code:
29 CS   563 198700563  280 1987  7 27 043900 Ic
74 DS 99004 74DS99004 6738 1990  4  1 165200 Eb
74 DS 99004 74DS99004 6911 1990  4  6 042500 Eb

What I need is to keep from file1 the full records whose header lines are listed in file2 (the output does not need to be in any particular order), for example:

Code:
74 DS 99004 74DS99004 6738 1990  4  1 165200 Eb
pr de te sa ox
      1.0      1.0  13.0000  35.7560    5.950
      3.0      3.0  13.0100  35.7550    5.970 
**
29 CS   563 198700563  280 1987  7 27 043900 Ic
pr de te sa ox
      9.0      8.9  12.2700  35.6500    6.060
     11.0     10.9  12.2100  35.6490    6.100
**
74 DS 99004 74DS99004 6911 1990  4  6 042500 Eb
pr de te sa
      2.0      2.0  15.3300  35.5920
      3.0      3.0  15.3300  35.5920
**

I have tried several approaches with awk one-liners, but didn't succeed because I don't know how to deal with multiple files that each need different RS and ORS settings. That is, for file2 RS and ORS are the awk defaults, whereas for file1:
Code:
{RS="\\*\\*\n+";ORS="**\n"}

Do you have a clue on how to proceed?

Thanks,

r.-

Last edited by rleal; 07-17-2009 at 05:42 AM.. Reason: bad post title
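
A hedged sketch of one way to handle the two record separators asked about above: awk applies a var=value operand on the command line at the moment it reaches it, so RS can be changed between file2 and file1. This assumes an awk that treats a multi-character RS as a regular expression (gawk and mawk do; POSIX leaves the result unspecified). The sample data and the names demo_dir, keep, lines and records are made up for illustration.

```shell
# Scratch directory so the demo does not touch real data.
demo_dir=$(mktemp -d)

# file2: one header line per record to keep (default RS, line by line).
printf '%s\n' 'B hdr' > "$demo_dir/file2"

# file1: multi-line records, each terminated by a "**" line.
printf '%s\n' 'A hdr' 'a1' '**' 'B hdr' 'b1' '**' > "$demo_dir/file1"

# The RS=... operand takes effect after file2 has been read and before
# file1 is opened, so only file1 is split at the "**" terminator lines.
records=$(awk '
    NR == FNR { keep[$0]; next }        # file2: store each header line
    {
        split($0, lines, "\n")          # lines[1] is the record header
        if (lines[1] in keep)
            printf "%s**\n", $0         # put the terminator back
    }
' "$demo_dir/file2" RS='\\*\\*\n' "$demo_dir/file1")

printf '%s\n' "$records"
rm -rf "$demo_dir"
```

With this layout each file1 record arrives as one multi-line $0, so the comparison against file2 only has to look at the first line of the record.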
# 2  
Old 07-17-2009
Try...
Code:
awk 'FNR==NR{a[$0]=$0;next}($0 in a),$0=="**"' file2 file1

# 3  
Old 07-17-2009
Can you please explain your code? I always get confused when working with two files in awk.
Thanks,
# 4  
Old 07-17-2009
Hi Rakesh,

Code:
FNR==NR{a[$0]=$0;next}

is used to store every line of the first file read (here file2) in an array "a", keyed by "$0" (the complete line).

Code:
($0 in a),$0=="**"

The above code checks whether each line from file1 is present in the array. If it is, awk prints from that line up to and including the next line equal to "**".
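
To see the two parts working together, here is a small self-contained run on made-up data (the file contents and the names demo_dir and selected are assumptions for illustration, not the original files). The second part is an awk range pattern of the form "start,end" with no action, so the default action (print) applies to every line in the range.

```shell
# Scratch directory so the demo does not touch real data.
demo_dir=$(mktemp -d)

# file2: header lines of the records to keep.
printf '%s\n' 'B hdr' > "$demo_dir/file2"

# file1: two records, each terminated by a line holding only "**".
printf '%s\n' 'A hdr' 'a1' '**' 'B hdr' 'b1' 'b2' '**' > "$demo_dir/file1"

# Pass 1 (file2): remember every line as an array key.
# Pass 2 (file1): the range pattern turns printing on at a line that
# is a key in a, and off again after the next "**" line.
selected=$(awk 'FNR==NR { a[$0]; next } ($0 in a), $0=="**"' \
    "$demo_dir/file2" "$demo_dir/file1")

printf '%s\n' "$selected"
rm -rf "$demo_dir"
```

Only the "B hdr" record is printed, terminator included. Because the match is on the whole line, a header in file2 that differs from file1 even by a trailing space would not be selected.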

Ygor:

Please let me know if I am wrong somewhere.
# 5  
Old 07-17-2009
That was absolutely brilliant! Thanks so much.

r.-
# 6  
Old 07-17-2009
Quote:
Originally Posted by panyam
Hi Rakesh,

Code:
FNR==NR{a[$0]=$0;next}

is used to store every line of the first file read (here file2) in an array "a", keyed by "$0" (the complete line).

Ygor:

Please let me know if I am wrong somewhere.
And this is because FNR is reset for each file read...right?
# 7  
Old 07-17-2009
Quote:
Originally Posted by rakeshawasthi
And this is because FNR is reset for each file read...right?

Sorry, I forgot to mention that. Yes: "NR" keeps incrementing across all input, whereas "FNR" resets to 1 each time awk starts reading the next file.
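
A quick way to watch the two counters is to print them for every line of two small files. This is only an illustrative sketch; the file names first and second, their contents, and the variables demo_dir and counters are made up.

```shell
# Scratch directory with two tiny input files.
demo_dir=$(mktemp -d)
printf '%s\n' 'x' 'y' > "$demo_dir/first"
printf '%s\n' 'p' 'q' 'r' > "$demo_dir/second"

# Print the current file name, the overall record number (NR) and the
# per-file record number (FNR) for every input line.
counters=$(cd "$demo_dir" && awk '{ print FILENAME, NR, FNR }' first second)

printf '%s\n' "$counters"
# first 1 1
# first 2 2
# second 3 1
# second 4 2
# second 5 3
rm -rf "$demo_dir"
```

NR and FNR are equal only while the first file is being read, which is why FNR==NR selects it. One caveat: if the first file is empty, FNR==NR is also true for the second file, so the test no longer tells the two apart.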