compare 2 arrays in perl


 
# 8  
Old 08-06-2009
Quote:
Originally Posted by karla
Except for the 1st column, yes, only the 5th is different, unless the whole line doesn't exist. And I'm looking for the KB0005-KB204320 diff.
Ok, back to the drawing board.

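Instead of building arrays or hashes and comparing them, this Perl program relies on the input being sorted, so that matching lines end up next to each other. It makes a single pass through the data stream and prints each pair whose 5th column differs, plus each line that has no partner. A pair with identical 5th-column values is not printed.
The logic is built on $key/$prevkey (every column except the 1st and 5th, for the current and previous line) and $cmp (1 while the previous line is still waiting for a partner, 0 once it has been paired).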

Code:
$
$ cat data.txt
KB0005 1019 T IFVATVPVI 0.691 PKC YES
KB0005 1036 T YFLQTSQQL 0.785 PKC YES
KB0005 1037 S FLQTSQQLK 0.585 DNAPK YES
KB0005 1045 S KQLESEGRS 0.669 PKC YES
KB0005 1045 S KQLESEGRS 0.880 unsp YES
KB204320 1019 T IFVATVPVI 0.699 PKC YES
KB204320 1036 T YFLQTSQQL 0.789 PKC YES
KB204320 1037 S FLQTSQQLK 0.589 DNAPK YES
KB204320 1045 S KQLESEGRS 0.880 unsp YES
$
$ cat testscr1.pl
#!/usr/bin/perl
use strict;
use warnings;

# State carried over from the previous input line
my ($prevkey, $prevline, $prevnum) = ("", "", 0);
# $cmp == 1 means "start of comparison": the previous line should still be
# compared with the next line for potential pairing.
# $cmp == 0 means "end of comparison": the previous line was already paired
# (or no line has been read yet, so empty input prints nothing).
my $cmp = 0;

while (<>) {
  chomp;
  my @x = split;
  # every column except the 1st (ID) and the 5th (number) must match
  my $key = "$x[1]:$x[2]:$x[3]:$x[5]:$x[6]";
  my $num = $x[4];
  my $line = sprintf("%-10s [MESG] => %s %s %s %s %s %s\n",$x[0],$x[1],$x[2],$x[3],$x[4],$x[5],$x[6]);
  if ($prevkey eq "") {  # we are on line 1; just set $cmp to 1 and move on
    $cmp = 1;
  } elsif ($key eq $prevkey) {  # we found a pair
    $cmp = 0;
    # print prev and current lines only if the numbers differ
    if ($prevnum != $num) {
      my $diff = sprintf("%6.3f",$prevnum - $num);
      $prevline =~ s/MESG/DIFF = $diff/;
      $line =~ s/MESG/DIFF = $diff/;
      print $prevline,$line;
    }
  } else {  # keys differ; either prev line is non repeating or it closed a pair
    # if $cmp equals 1 then print previous line else set $cmp to 1
    if ($cmp == 1) {
      $prevline =~ s/MESG/NO_REPETITION/;
      print $prevline;
    } else {
      $cmp = 1;
    }
  }
  $prevkey = $key;
  $prevline = $line;
  $prevnum = $num;
}
# if $cmp equals 1 then the last line never found a partner; print it
if ($cmp == 1) {
  $prevline =~ s/MESG/NO_REPETITION/;
  print $prevline;
}
$
$ # Sorted input is absolutely essential for this perl program
$ # In the data below, all lines except line # 7 occur in pairs; the pair on
$ # lines 8 and 9 has identical 5th-column values, so it is not printed
$
$ sort -k2,2 -k3,3 -k4,4 -k6,6 -k7,7 data.txt
KB0005 1019 T IFVATVPVI 0.691 PKC YES
KB204320 1019 T IFVATVPVI 0.699 PKC YES
KB0005 1036 T YFLQTSQQL 0.785 PKC YES
KB204320 1036 T YFLQTSQQL 0.789 PKC YES
KB0005 1037 S FLQTSQQLK 0.585 DNAPK YES
KB204320 1037 S FLQTSQQLK 0.589 DNAPK YES
KB0005 1045 S KQLESEGRS 0.669 PKC YES
KB0005 1045 S KQLESEGRS 0.880 unsp YES
KB204320 1045 S KQLESEGRS 0.880 unsp YES
$
$ sort -k2,2 -k3,3 -k4,4 -k6,6 -k7,7 data.txt | perl testscr1.pl
KB0005     [DIFF = -0.008] => 1019 T IFVATVPVI 0.691 PKC YES
KB204320   [DIFF = -0.008] => 1019 T IFVATVPVI 0.699 PKC YES
KB0005     [DIFF = -0.004] => 1036 T YFLQTSQQL 0.785 PKC YES
KB204320   [DIFF = -0.004] => 1036 T YFLQTSQQL 0.789 PKC YES
KB0005     [DIFF = -0.004] => 1037 S FLQTSQQLK 0.585 DNAPK YES
KB204320   [DIFF = -0.004] => 1037 S FLQTSQQLK 0.589 DNAPK YES
KB0005     [NO_REPETITION] => 1045 S KQLESEGRS 0.669 PKC YES
$
$ # All lines except the last two occur in pairs
$
$ sort -k2,2 -k3,3 -k4,4 -k6,6 -k7,7 data.txt | sed -n 1,8p
KB0005 1019 T IFVATVPVI 0.691 PKC YES
KB204320 1019 T IFVATVPVI 0.699 PKC YES
KB0005 1036 T YFLQTSQQL 0.785 PKC YES
KB204320 1036 T YFLQTSQQL 0.789 PKC YES
KB0005 1037 S FLQTSQQLK 0.585 DNAPK YES
KB204320 1037 S FLQTSQQLK 0.589 DNAPK YES
KB0005 1045 S KQLESEGRS 0.669 PKC YES
KB0005 1045 S KQLESEGRS 0.880 unsp YES
$
$ sort -k2,2 -k3,3 -k4,4 -k6,6 -k7,7 data.txt | sed -n 1,8p | perl testscr1.pl
KB0005     [DIFF = -0.008] => 1019 T IFVATVPVI 0.691 PKC YES
KB204320   [DIFF = -0.008] => 1019 T IFVATVPVI 0.699 PKC YES
KB0005     [DIFF = -0.004] => 1036 T YFLQTSQQL 0.785 PKC YES
KB204320   [DIFF = -0.004] => 1036 T YFLQTSQQL 0.789 PKC YES
KB0005     [DIFF = -0.004] => 1037 S FLQTSQQLK 0.585 DNAPK YES
KB204320   [DIFF = -0.004] => 1037 S FLQTSQQLK 0.589 DNAPK YES
KB0005     [NO_REPETITION] => 1045 S KQLESEGRS 0.669 PKC YES
KB0005     [NO_REPETITION] => 1045 S KQLESEGRS 0.880 unsp YES
$
$ # No line is repeated
$
$ sed -n 1,5p data.txt
KB0005 1019 T IFVATVPVI 0.691 PKC YES
KB0005 1036 T YFLQTSQQL 0.785 PKC YES
KB0005 1037 S FLQTSQQLK 0.585 DNAPK YES
KB0005 1045 S KQLESEGRS 0.669 PKC YES
KB0005 1045 S KQLESEGRS 0.880 unsp YES
$
$ sed -n 1,5p data.txt | perl testscr1.pl
KB0005     [NO_REPETITION] => 1019 T IFVATVPVI 0.691 PKC YES
KB0005     [NO_REPETITION] => 1036 T YFLQTSQQL 0.785 PKC YES
KB0005     [NO_REPETITION] => 1037 S FLQTSQQLK 0.585 DNAPK YES
KB0005     [NO_REPETITION] => 1045 S KQLESEGRS 0.669 PKC YES
KB0005     [NO_REPETITION] => 1045 S KQLESEGRS 0.880 unsp YES
$
$ # Three pairs of lines; no single-occurring line
$
$ sort -k2,2 -k3,3 -k4,4 -k6,6 -k7,7 data.txt | sed -n 1,6p
KB0005 1019 T IFVATVPVI 0.691 PKC YES
KB204320 1019 T IFVATVPVI 0.699 PKC YES
KB0005 1036 T YFLQTSQQL 0.785 PKC YES
KB204320 1036 T YFLQTSQQL 0.789 PKC YES
KB0005 1037 S FLQTSQQLK 0.585 DNAPK YES
KB204320 1037 S FLQTSQQLK 0.589 DNAPK YES
$
$ sort -k2,2 -k3,3 -k4,4 -k6,6 -k7,7 data.txt | sed -n 1,6p | perl testscr1.pl
KB0005     [DIFF = -0.008] => 1019 T IFVATVPVI 0.691 PKC YES
KB204320   [DIFF = -0.008] => 1019 T IFVATVPVI 0.699 PKC YES
KB0005     [DIFF = -0.004] => 1036 T YFLQTSQQL 0.785 PKC YES
KB204320   [DIFF = -0.004] => 1036 T YFLQTSQQL 0.789 PKC YES
KB0005     [DIFF = -0.004] => 1037 S FLQTSQQLK 0.585 DNAPK YES
KB204320   [DIFF = -0.004] => 1037 S FLQTSQQLK 0.589 DNAPK YES
$
$ # Only one line
$
$ head -1 data.txt
KB0005 1019 T IFVATVPVI 0.691 PKC YES
$
$ head -1 data.txt | perl testscr1.pl
KB0005     [NO_REPETITION] => 1019 T IFVATVPVI 0.691 PKC YES
$
$
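
For contrast, here is a minimal sketch of the hash-based alternative mentioned at the top of this post, which does not need sorted input. Treat it as an illustration only: the helper name diff_by_key and its report format are made up for this example and are not part of testscr1.pl. It assumes exactly two IDs of interest and at most one line per ID for each key (a later duplicate would overwrite the earlier one).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of testscr1.pl): pair records on every column
# except the 1st (ID) and the 5th (number), without requiring sorted input.
# Returns one report line per key whose numbers differ or that has no partner.
sub diff_by_key {
    my ($first_id, $second_id, @records) = @_;
    my (%num, @order);
    for my $rec (@records) {
        my @f = split ' ', $rec;
        next unless @f == 7;                                  # skip blank or malformed lines
        next unless $f[0] eq $first_id or $f[0] eq $second_id;
        my $key = join ' ', @f[1, 2, 3, 5, 6];
        push @order, $key unless exists $num{$key};           # remember first-seen order
        $num{$key}{$f[0]} = $f[4];
    }
    my @report;
    for my $key (@order) {
        my ($x, $y) = @{ $num{$key} }{ $first_id, $second_id };
        if (defined $x and defined $y) {
            next if $x == $y;                                 # identical numbers: nothing to report
            push @report, sprintf("%s [DIFF = %6.3f] %s vs %s", $key, $x - $y, $x, $y);
        } else {
            my $id = defined $x ? $first_id : $second_id;
            push @report, sprintf("%s [NO_REPETITION] only in %s", $key, $id);
        }
    }
    return @report;
}

# Same nine records as data.txt above, in their original (unsorted) order
my @data = (
    'KB0005 1019 T IFVATVPVI 0.691 PKC YES',
    'KB0005 1036 T YFLQTSQQL 0.785 PKC YES',
    'KB0005 1037 S FLQTSQQLK 0.585 DNAPK YES',
    'KB0005 1045 S KQLESEGRS 0.669 PKC YES',
    'KB0005 1045 S KQLESEGRS 0.880 unsp YES',
    'KB204320 1019 T IFVATVPVI 0.699 PKC YES',
    'KB204320 1036 T YFLQTSQQL 0.789 PKC YES',
    'KB204320 1037 S FLQTSQQLK 0.589 DNAPK YES',
    'KB204320 1045 S KQLESEGRS 0.880 unsp YES',
);
print "$_\n" for diff_by_key('KB0005', 'KB204320', @data);
```

On this data the sketch should report the three differing pairs (1019, 1036, 1037) and the unpaired KB0005 1045 PKC line, which matches the first run of testscr1.pl above. The trade-off is memory: the hash holds every key at once, whereas the sorted single-pass version only ever remembers the previous line.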

HTH,
tyler_durden