How to compare two text files
Post 302578277 by m.d.ludwig on Thursday 1st of December 2011 08:19:04 AM
Invoke the following Perl script as:
Code:
maillistjoiner maillist etradedatafile

Script is:
Code:
#! /usr/bin/perl

use strict;
use warnings;
use File::Basename;

$/ = "\n\n";    # input: read one blank-line-separated record at a time
$\ = "\n";      # output: append a newline to every print

my %MAILLIST;
my @FIELDLIST = qw{ etDefaultLogin eradesmtpemail employeeNumber };

# Parse one record (a block of key=value lines) into a hash reference

sub parse($) {
    my $X = {};

    foreach my $entry (split /\n/, $_[0]) {
        my ($key, $value) = $entry =~ m{^\s*([^=]+)=(\S+)};
        next unless defined $key;

        $X->{$key} = $value;
    };

    return $X;
}

# Read maillist

my $list = shift(@ARGV);
die "Usage: maillistjoiner maillist etradedatafile\n" unless defined $list;

open(FH, '<', $list) or die "Cannot open $list: $!\n";

while (<FH>) {
    my $X = parse($_);

    unless (defined $X->{eruid}) {
    #   print STDERR $list, '(', $., '): missing eruid';
        next;
    }

    unless (defined $X->{eradesmtpemail}) {
        print STDERR $list, '(', $., '): missing eradesmtpemail for ', $X->{eruid};
        next;
    }

    $MAILLIST{$X->{eruid}} = $X->{eradesmtpemail};
}

close FH;

# Read datafile

my $data = shift(@ARGV);
die "Usage: maillistjoiner maillist etradedatafile\n" unless defined $data;

open(FH, '<', $data) or die "Cannot open $data: $!\n";

while (<FH>) {
    my $X = parse($_);

    unless (defined $X->{etDefaultLogin}) {
    #   print STDERR $data, '(', $., '): missing etDefaultLogin';
        next;
    }

    if (defined $MAILLIST{$X->{etDefaultLogin}}) {
        $X->{eradesmtpemail} = $MAILLIST{$X->{etDefaultLogin}};
    }
    else {
        $X->{eradesmtpemail} = $X->{mail};
    }

    foreach my $key (@FIELDLIST) {
        if (defined $X->{$key}) {
            print $key, '=', $X->{$key};
        }
    }

    print $\;    # record separator: prints "\n", and print appends $\ once more
}

close FH;
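
If you want to try the approach without the real files, here is a minimal sketch of the same join run against in-memory data. The sample records, the example.com addresses, and the helper names (parse_record, read_records, joined_email) are made up for illustration; only the field names and the parsing regex come from the script above. I am assuming records are separated by exactly one blank line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: records are separated by one blank line,
# and each line inside a record is key=value.
my $maillist = "eruid=jdoe\neradesmtpemail=jdoe\@example.com\n\n"
             . "eruid=asmith\neradesmtpemail=asmith\@example.com\n";
my $datafile = "etDefaultLogin=jdoe\nmail=old\@example.com\nemployeeNumber=1001\n\n"
             . "etDefaultLogin=bnobody\nmail=bnobody\@example.com\nemployeeNumber=1002\n";

# Split one record into a hash reference of key => value.
sub parse_record {
    my ($record) = @_;
    my %fields;
    foreach my $line (split /\n/, $record) {
        my ($key, $value) = $line =~ m{^\s*([^=]+)=(\S+)};
        $fields{$key} = $value if defined $key;
    }
    return \%fields;
}

# Read every record from a string through an in-memory filehandle.
sub read_records {
    my ($text) = @_;
    local $/ = "\n\n";    # one "line" is one record
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!\n";
    my @records = map { parse_record($_) } <$fh>;
    close $fh;
    return @records;
}

# Build the eruid => e-mail lookup from the mail list.
my %mail_for;
foreach my $rec (read_records($maillist)) {
    $mail_for{ $rec->{eruid} } = $rec->{eradesmtpemail};
}

# Prefer the mail-list address; fall back to the record's own mail field.
sub joined_email {
    my ($rec) = @_;
    my $login = $rec->{etDefaultLogin};
    return exists $mail_for{$login} ? $mail_for{$login} : $rec->{mail};
}

foreach my $rec (read_records($datafile)) {
    print "$rec->{etDefaultLogin} -> ", joined_email($rec), "\n";
}
```

With this sample data, jdoe picks up the address from the mail list, while bnobody (who is not in the mail list) keeps the mail value from the data file. Using a lexical filehandle and a localized $/ keeps the record-mode setting from leaking into the rest of a larger program.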

 

CDB_File(3pm)            User Contributed Perl Documentation            CDB_File(3pm)

NAME
       CDB_File - Perl extension for access to cdb databases

SYNOPSIS
           use CDB_File;
           $c = tie %h, 'CDB_File', 'file.cdb' or die "tie failed: $!\n";

           $fh = $c->handle;
           sysseek $fh, $c->datapos, 0 or die ...;
           sysread $fh, $x, $c->datalen;
           undef $c;
           untie %h;

           $t = new CDB_File ('t.cdb', "t.$$") or die ...;
           $t->insert('key', 'value');
           $t->finish;

           CDB_File::create %t, $file, "$file.$$";

       or

           use CDB_File 'create';
           create %t, $file, "$file.$$";

DESCRIPTION
       CDB_File is a module which provides a Perl interface to Dan Bernstein's
       cdb package:

           cdb is a fast, reliable, lightweight package for creating and
           reading constant databases.

   Reading from a cdb
       After the "tie" shown above, accesses to %h will refer to the cdb file
       "file.cdb", as described in "tie" in perlfunc.

       Low level access to the database is provided by the three methods
       "handle", "datapos", and "datalen". To use them, you must remember the
       "CDB_File" object returned by the "tie" call: $c in the example above.
       The "datapos" and "datalen" methods return the file offset position and
       length respectively of the most recently visited key (for example, via
       "exists").

       Beware that if you create an extra reference to the "CDB_File" object
       (like $c in the example above) you must destroy it (with "undef")
       before calling "untie" on the hash. This ensures that the object's
       "DESTROY" method is called. Note that "perl -w" will check this for
       you; see perltie for further details.

   Creating a cdb
       A cdb file is created in three steps. First call "new CDB_File ($final,
       $tmp)", where $final is the name of the database to be created, and
       $tmp is the name of a temporary file which can be atomically renamed to
       $final. Secondly, call the "insert" method once for each (key, value)
       pair. Finally, call the "finish" method to complete the creation and
       renaming of the cdb file.

       Alternatively, call the "insert()" method with multiple key/value
       pairs. This can be significantly faster because there is less crossing
       over the bridge from perl to C code. One simple way to do this is to
       pass in an entire hash, as in: "$cdbmaker->insert(%hash);".

       A simpler interface to cdb file creation is provided by
       "CDB_File::create %t, $final, $tmp". This creates a cdb file named
       $final containing the contents of %t. As before, $tmp must name a
       temporary file which can be atomically renamed to $final.
       "CDB_File::create" may be imported.

EXAMPLES
       These are all complete programs.

       1. Convert a Berkeley DB (B-tree) database to cdb format.

           use CDB_File;
           use DB_File;

           tie %h, DB_File, $ARGV[0], O_RDONLY, undef, $DB_BTREE or
                   die "$0: can't tie to $ARGV[0]: $!\n";

           CDB_File::create %h, $ARGV[1], "$ARGV[1].$$" or
                   die "$0: can't create cdb: $!\n";

       2. Convert a flat file to cdb format. In this example, the flat file
       consists of one key per line, separated by a colon from the value.
       Blank lines and lines beginning with # are skipped.

           use CDB_File;

           $cdb = new CDB_File("data.cdb", "data.$$") or
                   die "$0: new CDB_File failed: $!\n";
           while (<>) {
                   next if /^$/ or /^#/;
                   chop;
                   ($k, $v) = split /:/, $_, 2;
                   if (defined $v) {
                           $cdb->insert($k, $v);
                   } else {
                           warn "bogus line: $_\n";
                   }
           }
           $cdb->finish or die "$0: CDB_File finish failed: $!\n";

       3. Perl version of cdbdump.

           use CDB_File;

           tie %data, 'CDB_File', $ARGV[0] or
                   die "$0: can't tie to $ARGV[0]: $!\n";
           while (($k, $v) = each %data) {
                   print '+', length $k, ',', length $v, ":$k->$v\n";
           }
           print "\n";

       4. For really enormous data values, you can use "handle", "datapos",
       and "datalen", in combination with "sysseek" and "sysread", to avoid
       reading the values into memory. Here is the script bun-x.pl, which can
       extract uncompressed files and directories from a bun file.

           use CDB_File;

           sub unnetstrings {
               my($netstrings) = @_;
               my @result;
               while ($netstrings =~ s/^([0-9]+)://) {
                   push @result, substr($netstrings, 0, $1, '');
                   $netstrings =~ s/^,//;
               }
               return @result;
           }

           my $chunk = 8192;

           sub extract {
               my($file, $t, $b) = @_;
               my $head = $$b{"H$file"};
               my ($code, $type) = $head =~ m/^([0-9]+)(.)/;
               if ($type eq "/") {
                   mkdir $file, 0777;
               } elsif ($type eq "_") {
                   my ($total, $now, $got, $x);
                   open OUT, ">$file" or die "open for output: $!\n";
                   exists $$b{"D$code"} or die "corrupt bun file\n";
                   my $fh = $t->handle;
                   sysseek $fh, $t->datapos, 0;
                   $total = $t->datalen;
                   while ($total) {
                       $now = ($total > $chunk) ? $chunk : $total;
                       $got = sysread $fh, $x, $now;
                       if (not $got) { die "read error\n"; }
                       $total -= $got;
                       print OUT $x;
                   }
                   close OUT;
               } else {
                   print STDERR "warning: skipping unknown file type\n";
               }
           }

           die "usage\n" if @ARGV != 1;

           my (%b, $t);
           $t = tie %b, 'CDB_File', $ARGV[0] or die "tie: $!\n";
           map { extract $_, $t, \%b } unnetstrings $b{""};

       5. Although a cdb file is constant, you can simulate updating it in
       Perl. This is an expensive operation, as you have to create a new
       database, and copy into it everything that's unchanged from the old
       database. (As compensation, the update does not affect database
       readers. The old database is available for them, till the moment the
       new one is "finish"ed.)

           use CDB_File;

           $file = 'data.cdb';
           $new = new CDB_File($file, "$file.$$") or
                   die "$0: new CDB_File failed: $!\n";

           # Add the new values; remember which keys we've seen.
           while (<>) {
                   chop;
                   ($k, $v) = split;
                   $new->insert($k, $v);
                   $seen{$k} = 1;
           }

           # Add any old values that haven't been replaced.
           tie %old, 'CDB_File', $file or die "$0: can't tie to $file: $!\n";
           while (($k, $v) = each %old) {
                   $new->insert($k, $v) unless $seen{$k};
           }

           $new->finish or die "$0: CDB_File finish failed: $!\n";

REPEATED KEYS
       Most users can ignore this section.

       A cdb file can contain repeated keys. If the "insert" method is called
       more than once with the same key during the creation of a cdb file,
       that key will be repeated.

       Here's an example.

           $cdb = new CDB_File ("$file.cdb", "$file.$$") or die ...;
           $cdb->insert('cat', 'gato');
           $cdb->insert('cat', 'chat');
           $cdb->finish;

       Normally, any attempt to access a key retrieves the first value stored
       under that key. This code snippet always prints gato.

           $catref = tie %catalogue, CDB_File, "$file.cdb" or die ...;
           print "$catalogue{cat}";

       However, all the usual ways of iterating over a hash ("keys",
       "values", and "each") do the Right Thing, even in the presence of
       repeated keys. This code snippet prints cat cat gato chat.

           print join(' ', keys %catalogue, values %catalogue);

       And these two both print cat:gato cat:chat, although the second is
       more efficient.

           foreach $key (keys %catalogue) {
                   print "$key:$catalogue{$key} ";
           }

           while (($key, $val) = each %catalogue) {
                   print "$key:$val ";
           }

       The "multi_get" method retrieves all the values associated with a key.
       It returns a reference to an array containing all the values. This
       code prints gato chat.

           print "@{$catref->multi_get('cat')}";

       "multi_get" always returns an array reference. If the key was not
       found in the database, it will be a reference to an empty array. To
       test whether the key was found, you must test the array, and not the
       reference.

           $x = $catref->multi_get($key);
           warn "$key not found\n" unless $x;   # WRONG; message never printed
           warn "$key not found\n" unless @$x;  # Correct

RETURN VALUES
       The routines "tie", "new", and "finish" return undef if the attempted
       operation failed; $! contains the reason for failure.

DIAGNOSTICS
       The following fatal errors may occur. (See "eval" in perlfunc if you
       want to trap them.)

       Modification of a CDB_File attempted
           You attempted to modify a hash tied to a CDB_File.

       CDB database too large
           You attempted to create a cdb file larger than 4 gigabytes.

       [ Write to | Read of | Seek in ] CDB_File failed: <error string>
           If error string is Protocol error, you tried to "use CDB_File" to
           access something that isn't a cdb file. Otherwise a serious OS
           level problem occurred, for example, you have run out of disk
           space.

PERFORMANCE
       Sometimes you need to get the most performance possible out of a
       library. Rumour has it that perl's tie() interface is slow. In order
       to get around that you can use CDB_File in an object oriented fashion,
       rather than via tie().

           my $cdb = CDB_File->TIEHASH('/path/to/cdbfile.cdb');

           if ($cdb->EXISTS('key')) {
               print "Key is: ", $cdb->FETCH('key'), "\n";
           }

       For more information on the methods available on tied hashes see
       perltie.

BUGS
       The "create()" interface could be done with "TIEHASH".

SEE ALSO
       cdb(3).

AUTHOR
       Tim Goodwin, <tjg@star.le.ac.uk>. CDB_File began on 1997-01-08.

       Now maintained by Matt Sergeant, <matt@sergeant.org>

perl v5.14.2                        2012-06-01                        CDB_File(3pm)