Compare files


 
# 15  
Old 03-13-2007
Here's a Python alternative:
Code:
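# For each file1 record, print its name followed by every file2 key
# whose value list contains all of the record's values.
for line1 in open("file1"):
    fields1 = line1.split()
    if len(fields1) < 2:
        continue  # skip blank lines and records without values
    f1col = set(fields1[1:])
    matches = []
    for line2 in open("file2"):
        fields2 = line2.split()
        # a subset test replaces the original nested counting loops
        if fields2 and f1col <= set(fields2[1:]):
            matches.append(fields2[0])
    print(" ".join([fields1[0]] + matches))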

output:
Code:
# ./test.py

mary role1 role3
anne role4
jane role2 role5
sam role1 role3 role4

and
Code:
# ./test.py

mary role2
anne role2

# 16  
Old 03-13-2007
What if I want to compare the files the other way around? I can swap the two files, but with my current data that produces far too many output fields, and I would still have to rearrange the result to get the output I need (file1 has almost 23,000 rows). This time the output should list every file2 record whose values are all present in the file1 record. (The previous request was for every file2 record that contains all the values of the file1 record.) I need both to get the desired output for my records.

File1
mary a b c d
anne e f g h
jane a d e
sam g h

File2
role1 a b
role2 a b c
role3 g h
role4 a e
role5 e f g

Output
mary role1 role2
anne role3 role5
jane role4
sam role3

I would appreciate it if the code is in Korn shell or Perl. Thanks in advance...
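
For comparison with the earlier Python alternative, here is a minimal Python sketch of the reverse comparison requested above. The function name `match_roles` and the inline sample lists are illustrative assumptions; in practice the lines would be read from the two files. File2 is parsed once up front, which may help with a file1 of roughly 23,000 rows.

```python
def match_roles(file1_lines, file2_lines):
    """For each file1 record, list the file2 keys whose values all occur in it."""
    # Parse file2 once: (key, set of values) pairs, in file order.
    roles = []
    for line in file2_lines:
        fields = line.split()
        if len(fields) > 1:
            roles.append((fields[0], set(fields[1:])))

    result = []
    for line in file1_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        have = set(fields[1:])
        # keep a key only when every one of its values is in this record
        matched = [key for key, needed in roles if needed <= have]
        result.append(" ".join([fields[0]] + matched))
    return result


# Sample data copied from the post above.
file1 = ["mary a b c d", "anne e f g h", "jane a d e", "sam g h"]
file2 = ["role1 a b", "role2 a b c", "role3 g h", "role4 a e", "role5 e f g"]

for row in match_roles(file1, file2):
    print(row)
```

On the sample data this prints the four lines listed under Output above.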
# 17  
Old 03-13-2007
Tested and it works fine!

Try this!

Code:
#! /opt/third-party/bin/perl
use strict;
use warnings;

# Load secondfile into a hash: key => colon-joined list of its values.
my %fileHash;
open(my $second, "<", "secondfile") || die("Unable to open secondfile. <$!>\n");

while( <$second> ) {
  chomp;
  my @split_arr = split(/ /, $_);
  next unless @split_arr > 1;
  $fileHash{$split_arr[0]} = join(":", @split_arr[1 .. $#split_arr]);
}

close($second);

open(my $first, "<", "firstfile") || die("Unable to open firstfile. <$!>\n");

# For each firstfile record, print every key whose values all occur in it.
while( <$first> ) {
  chomp;
  my @first_arr = split(/ /, $_);
  next unless @first_arr;
  print "$first_arr[0] ";
  foreach my $key ( sort keys %fileHash ) {
    my @second_arr = split(/:/, $fileHash{$key});
    my $set = 0;
    for( my $i = 0; $i <= $#second_arr; $i++ ) {
      $set = 0;
      for( my $j = 1; $j <= $#first_arr; $j++ ) {
        # exact comparison; a regex match would also accept substrings
        if( $first_arr[$j] eq $second_arr[$i] ) {
          $set = 1;
          last;
        }
      }
      last if( $set == 0 );
    }
    print "$key " if( $set == 1 );
  }
  print "\n";
}

close($first);

exit 0;

# 18  
Old 03-13-2007
Quote:
Originally Posted by matrixmadhan
Tested and it works fine!

Try this!

This one works. Tested.
# 19  
Old 03-13-2007
Quote:
Originally Posted by matrixmadhan
Run the code below as is and let us know the results.

I have modified the code.

Code:
#! /opt/third-party/bin/perl

open(FILE, "<", secondfile) || die ("Unable to open secondfile. <$!>\n");

while( <FILE> ) {
  chomp;
  @split_arr = split(/ /, $_);
  my $dump;
  for( my $i = 1; $i <= $#split_arr; $i++ ) {
    $dump .= ( $split_arr[$i] . ":");
  }
  $dump =~ s/:$//;
  $fileHash{$split_arr[0]} = $dump;
}

close(FILE);

open(FILE, "<", firstfile) || die ("Unable to open firstfile. <$!>\n");

while( <FILE> ) {
  chomp;
  @split_arr = split(/ /, $_);
  my $dump;
  for( my $i = 1; $i < $#split_arr + 1; $i++ ) {
    $dump .= $split_arr[$i];
  }
  print "$split_arr[0] ";
  foreach my $key ( keys %fileHash ) {
    @diff_arr = split(/:/, $fileHash{$key});
    for( my $i = 0; $i <= $#diff_arr; $i++ ) {
      if( $dump =~ $diff_arr[$i] ) {
        print "$key ";
      }
    }
  }
  print "\n";
}

close(FILE);

exit 0

I found a problem with this one (the reverse of the other): the output repeats.
File1
mary MI_AP MI_RC
anne MI_RC

File2
role1 MI_AP_REC
role2 MI_AP MI_RC

Output of this code:
mary role2 role2
anne role2

Needed output:
mary role2
anne role2

Am I giving you too much trouble? Perl is really new to me.
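
A small Python sketch of what seems to cause the repetition in the quoted Perl script: the file1 values are glued into one string, and the key is printed once for every file2 value found in that string. The helper names `parse`, `repeated`, and `once` are hypothetical, and `once` is only one way to get the needed output, assuming a key should be listed when every file1 value is among that key's values.

```python
def parse(lines):
    """Split each non-blank line into (key, [values])."""
    records = []
    for line in lines:
        fields = line.split()
        if fields:
            records.append((fields[0], fields[1:]))
    return records


def repeated(file1_lines, file2_lines):
    """Mimic the quoted script: the key is emitted once per matching value."""
    out = []
    for name, values in parse(file1_lines):
        dump = "".join(values)  # values glued together with no separator
        row = [name]
        for key, role_values in parse(file2_lines):
            for value in role_values:
                if value in dump:  # substring test, like an unanchored regex
                    row.append(key)
        out.append(" ".join(row))
    return out


def once(file1_lines, file2_lines):
    """Emit a key once, when every file1 value is among that key's values."""
    out = []
    for name, values in parse(file1_lines):
        row = [name]
        for key, role_values in parse(file2_lines):
            if values and set(values) <= set(role_values):
                row.append(key)
        out.append(" ".join(row))
    return out


# Sample data copied from the post above.
file1 = ["mary MI_AP MI_RC", "anne MI_RC"]
file2 = ["role1 MI_AP_REC", "role2 MI_AP MI_RC"]

print(repeated(file1, file2))  # ['mary role2 role2', 'anne role2']
print(once(file1, file2))      # ['mary role2', 'anne role2']
```

Comparing whole values as sets, rather than searching a concatenated string, also avoids accidental matches between names such as MI_AP and MI_AP_REC.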
# 20  
Old 03-13-2007
Quote:
Originally Posted by The One
I found a problem with this one (the reverse of the other): the output repeats.
File1
mary MI_AP MI_RC
anne MI_RC

File2
role1 MI_AP_REC
role2 MI_AP MI_RC

Output of this code:
mary role2 role2
anne role2

Needed output:
mary role2
anne role2

Am I giving you too much trouble? Perl is really new to me.
Sorry about the confusing username above. I forgot my friend was logged in on my PC, and I forgot to switch users before replying.
# 21  
Old 03-13-2007
I wonder how you got the following output:

Quote:
Output of this code:
mary role2 role2
anne role2
Well, I accept that the script had a bug in it,

but it produced output of this form:

Quote:
mary role2
anne
I have fixed it now:

Code:
#! /opt/third-party/bin/perl
use strict;
use warnings;

# Load secondfile into a hash: key => colon-joined list of its values.
my %fileHash;
open(my $second, "<", "secondfile") || die("Unable to open secondfile. <$!>\n");

while( <$second> ) {
  chomp;
  my @split_arr = split(/ /, $_);
  next unless @split_arr > 1;
  $fileHash{$split_arr[0]} = join(":", @split_arr[1 .. $#split_arr]);
}

close($second);

open(my $first, "<", "firstfile") || die("Unable to open firstfile. <$!>\n");

while( <$first> ) {
  chomp;
  my @first_arr = split(/ /, $_);
  next unless @first_arr;
  print "$first_arr[0] ";
  foreach my $key ( sort keys %fileHash ) {
    my @second_arr = split(/:/, $fileHash{$key});
    my $set = 0;
    for( my $i = 0; $i <= $#second_arr; $i++ ) {
      $set = 0;
      for( my $j = 1; $j <= $#first_arr; $j++ ) {
        # exact comparison; a regex match would also accept substrings
        if( $first_arr[$j] eq $second_arr[$i] ) {
          $set = 1;
          last;
        }
      }
      # stop at the first missing value, unless the record has only one value
      last if( $set == 0 && $#first_arr > 1 );
    }
    print "$key " if( $set == 1 );
  }
  print "\n";
}

close($first);

exit 0;

I verified it against two sets of input,

so hopefully it works properly now!

Give it a shot!

Sorry for not getting it right from the beginning!
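
One possible reading of the two different outputs discussed in this post is that they correspond to the two directions of the comparison. The Python sketch below is only an illustration, not the Perl script above; the names `compare` and `file1_in_file2` are hypothetical. It runs both directions on the sample data from the bug report.

```python
def compare(file1_lines, file2_lines, file1_in_file2):
    """List file2 keys per file1 record, in one of two directions.

    file1_in_file2=True : key matches when all file1 values are in file2's list.
    file1_in_file2=False: key matches when all file2 values are in file1's list.
    """
    roles = [line.split() for line in file2_lines if line.split()]
    rows = []
    for line in file1_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        have = set(fields[1:])
        row = [fields[0]]
        for role in roles:
            needed = set(role[1:])
            matched = have <= needed if file1_in_file2 else needed <= have
            if matched:
                row.append(role[0])
        rows.append(" ".join(row))
    return rows


# Sample data copied from the bug report quoted earlier in the thread.
file1 = ["mary MI_AP MI_RC", "anne MI_RC"]
file2 = ["role1 MI_AP_REC", "role2 MI_AP MI_RC"]

print(compare(file1, file2, file1_in_file2=False))  # ['mary role2', 'anne']
print(compare(file1, file2, file1_in_file2=True))   # ['mary role2', 'anne role2']
```

Under this reading, the output with a bare "anne" comes from requiring all file2 values in file1, while the needed output comes from requiring all file1 values in file2.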