Extracting lines present in one file but missing in another using Perl


 
# 1  
Old 03-25-2010

Hey
I have an input file containing a list of numbers like:
Code:
U01120.CDS.1
D25328.CDS.1
X15573.CDS.1
K03515.CDS.1
L44140.CDS.10
U24183.CDS.1
M97347.CDS.1
U05259.CDS.1

And another input file containing results created on the basis of the above input:
Code:
G6PT_HUMAN U01120.CDS.1 -1.9450 3.1706 -2.0536
K6PP_HUMAN D25328.CDS.1 0.5615 -0.0029 -2.1845
K6PL_HUMAN X15573.CDS.1 0.6284 0.9183 -2.9719
G6PI_HUMAN K03515.CDS.1 1.0377 1.9856 -1.1401
K6PF_HUMAN U24183.CDS.1 0.6435 2.5546 -3.3403
G6NT_HUMAN M97347.CDS.1 0.7862 -0.7197 2.6020
C79A_HUMAN U05259.CDS.1 -1.6145 -1.6145 1.9184
C4BP_HUMAN M62486.CDS.1 -0.9203 -0.0660 1.7583

There should be one result line per input number, but 77 lines are missing from the result file (I checked this earlier in my program). The files are too long to search by hand for the missing numbers, so I am writing the following program to extract the numbers that did not produce an output line.

I have put the numbers from the first file into one hash, %original, as keys, and the first two columns of the second file into another hash, %results.
Now I reverse the two columns in %results. That way it is the keys of both hashes that I want to compare.
The idea of the loop is to take the numbers in %original one by one and check whether they are present in %results. If a number is present, I just want to move on to the next one, but if it is not present, I want it extracted and printed.
I have only posted the part of the code that does not seem to work, but warnings and use strict are turned on in the program.

Code:
 
my %revresults = reverse %results;
my $linemiss = 0;    #Variable for counting missing lines.
foreach my $key ( keys %original ) { 
    unless ( exists $results{$key} ) {
       $linemiss++;
       print "$key\n";    #This line gives the same output as the content of %original instead of only the missing 77 numbers.
    }
}
my $linedif = $lineorg - $lineres;   #Difference between original file and result file, should be same number as $linemiss but I get $linemiss = $lineorg and $linedif = 77.
print "$linedif $linemiss\n"; 
close OUT;

Can anyone tell me where I go wrong in the above loop? I don't have much experience with Perl or with programming in general, so I'm not sure whether I'm using the 'unless ( exists' part correctly. Is this the right way to compare only keys, without regard to the values?

Last edited by radoulov; 03-26-2010 at 08:19 AM.. Reason: Added code tags on the input data.
# 2  
Old 03-26-2010
Try this:

Code:
perl -lane'
  $_{$F[1]} = 1 and next if @ARGV;
  print unless $_{$_};
  ' resultfile inputfile
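
For readers who prefer a script to a one-liner, here is a minimal sketch of the same idea. It builds one lookup hash keyed by the second column of the result lines and tests each input number against that hash. Note that the loop in post #1 tests %results, which is keyed by the first column, rather than the reversed hash %revresults, so no number is ever found. The sub name missing_ids and the in-memory sample arrays are assumptions made for illustration; the sample data is copied from post #1, and real input would be read from the two files instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original posts): return the IDs in @$ids
# that never appear in the second column of @$result_lines.
# Both arguments are array references of already-chomped lines.
sub missing_ids {
    my ($ids, $result_lines) = @_;

    # Key the lookup hash by the ID, i.e. the second whitespace-separated column.
    my %seen;
    for my $line (@$result_lines) {
        my (undef, $id) = split ' ', $line;
        $seen{$id} = 1 if defined $id;
    }

    # exists() looks only at keys, never at values; grep keeps the input order.
    return grep { !exists $seen{$_} } @$ids;
}

# Sample data copied from the two files shown in post #1.
my @ids = qw(
    U01120.CDS.1 D25328.CDS.1 X15573.CDS.1 K03515.CDS.1
    L44140.CDS.10 U24183.CDS.1 M97347.CDS.1 U05259.CDS.1
);
my @results = (
    'G6PT_HUMAN U01120.CDS.1 -1.9450 3.1706 -2.0536',
    'K6PP_HUMAN D25328.CDS.1 0.5615 -0.0029 -2.1845',
    'K6PL_HUMAN X15573.CDS.1 0.6284 0.9183 -2.9719',
    'G6PI_HUMAN K03515.CDS.1 1.0377 1.9856 -1.1401',
    'K6PF_HUMAN U24183.CDS.1 0.6435 2.5546 -3.3403',
    'G6NT_HUMAN M97347.CDS.1 0.7862 -0.7197 2.6020',
    'C79A_HUMAN U05259.CDS.1 -1.6145 -1.6145 1.9184',
    'C4BP_HUMAN M62486.CDS.1 -0.9203 -0.0660 1.7583',
);

my @missing = missing_ids(\@ids, \@results);
print "$_\n" for @missing;                  # prints L44140.CDS.10
print scalar(@missing), " missing\n";       # prints "1 missing"
```

With the real files, the count printed on the last line should match the difference between the two line counts, provided every result line carries a number that also occurs in the input file and no number is repeated.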

 