comparing columns of a file with another file


 
# 1  
Old 01-11-2011
comparing columns of a file with another file

Hello experts,

I started with Perl just a month ago and have the following question.

I have two files.

File 1 looks like this:
Code:
GKHGGSS0098       PPP.100.F.LE
GKHYXDF9081       KKK.100.F.LE
GKHSDFT6546       JKL.100.F.LE
GKHGGHJ3123       ABC.100.F.LE

File 2 looks like this:
Code:
>GKHGGSS0098 
atatatatagacagatgaacagat
>GKHGGSS0098 
atatatatatatatatatatatatatatata
>GKHYXDF9081
gggacacatagacagatagaca
>GKHSDFT6546
gacagatatatatatatatatata
>GKHGGHJ3123
ggccgcgcgcatagacaccagatagacagat

I want to write a script that reads column 2 of file 1 and considers only those entries whose first three letters are PPP or KKK.

In the example above there is one PPP entry and one KKK entry, so the script will take two entries from the file. The script then needs to look at the first-column values for those entries, which are GKHGGSS0098 and GKHYXDF9081.

Finally, the script will compare these first-column values from file 1 with the header lines of file 2 (the part after the > sign).

Whenever they match, the script will extract the matching records (header and sequence) from file 2 and store them in a result file.

So in this case, the output file will contain:
Code:
>GKHGGSS0098 
atatatatagacagatgaacagat
>GKHGGSS0098 
atatatatatatatatatatatatatatata
>GKHYXDF9081
gggacacatagacagatagaca

I do not know how to go about this. Should I use a hash? Please help.

Newbie

Last edited by Franklin52; 01-11-2011 at 04:51 PM.. Reason: Please use code tags
# 2  
Old 01-11-2011
Try this,

Code:
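#!/usr/bin/perl
use strict;
use warnings;
# Remember the IDs (column 1 of file1) whose column 2 starts with PPP or KKK.
my %hs;
open(my $fh1, "<", "file1") or die "Cannot open file1: $!";
while (<$fh1>) {
    $hs{$1} = $2 if /^(\S+)\s+(PPP|KKK)/;
}
close($fh1);

# Print each record of file2 (header plus sequence lines) whose ID was remembered.
my $p = 0;
open(my $fh2, "<", "file2") or die "Cannot open file2: $!";
while (<$fh2>) {
    if (/^>(\S+)/) { $p = exists $hs{$1}; }   # header line: switch printing on or off
    print if $p;
}
close($fh2);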

# 3  
Old 01-13-2011
Hey, thank you so much. However, I want to create two result files: one like file 1 that contains only
GKHGGSS0098 PPP.100.F.LE
GKHYXDF9081 KKK.100.F.LE

and another like file 2 that contains the corresponding data, as it is printed on the screen now.

Please let me know if this is not clear.
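
One possible way to get both result files is sketched below. It is not taken from the replies above; it only assumes the layout shown in the first post (file 1 is whitespace-separated with the ID in column 1, file 2 has `>ID` header lines followed by sequence lines). The names `select_matches`, `read_lines` and `write_lines` are made up for this illustration, and the four file names are taken from the command line rather than hard-coded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given the lines of file 1 and file 2, return
# (1) the file 1 lines whose second column starts with PPP or KKK and
# (2) the file 2 records (header + sequence lines) for those IDs.
sub select_matches {
    my ($table_lines, $fasta_lines) = @_;
    my (%wanted, @kept_table, @kept_fasta);

    for my $line (@$table_lines) {
        my ($id, $tag) = split ' ', $line;
        next unless defined $tag && $tag =~ /^(?:PPP|KKK)/;
        $wanted{$id} = 1;            # the hash answers "do we want this ID?"
        push @kept_table, $line;
    }

    my $keep = 0;
    for my $line (@$fasta_lines) {
        # A header line decides whether the following lines are kept.
        $keep = exists $wanted{$1} if $line =~ /^>(\S+)/;
        push @kept_fasta, $line if $keep;
    }
    return (\@kept_table, \@kept_fasta);
}

# Read a whole file into a list of lines.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# Write a list of lines to a file, replacing any existing content.
sub write_lines {
    my ($path, $lines) = @_;
    open(my $fh, '>', $path) or die "Cannot write $path: $!";
    print {$fh} @$lines;
    close($fh) or die "Cannot close $path: $!";
}

# Only runs when all four file names are given on the command line.
if (@ARGV == 4) {
    my ($in1, $in2, $out1, $out2) = @ARGV;
    my ($table, $fasta) = select_matches([read_lines($in1)], [read_lines($in2)]);
    write_lines($out1, $table);
    write_lines($out2, $fasta);
}
```

Saved under a name of your choice (say `split_matches.pl`, a placeholder), it could be run as `perl split_matches.pl file1 file2 result1 result2`. The hash holds only the wanted IDs, so file 2 is read once and each header is checked with a single lookup.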