Sorry about the lame test data. It was very generic because I didn't want to disclose our customers' email addresses. That's probably why your final version of the script didn't produce consistent output: I was testing with different data, of course. Maybe my next script should be one that parses a log and scrambles the email and IP addresses, so I can provide more representative test data when posting.
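
Something along these lines is roughly what I have in mind for the scrambler. It is only a sketch: the names scramble_line, fake_email and fake_ip are made up, the sample lines are invented, and it assumes addresses appear inside angle brackets and that IPs are dotted IPv4. Each real value always maps to the same fake one, so the sender/recipient relationships in the log survive.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# real value => placeholder, so repeated values scramble consistently
my (%fake_email, %fake_ip);

# Return the placeholder for an e-mail address, creating one if needed.
sub fake_email
{
    my ($real) = @_;
    unless (exists $fake_email{$real}) {
        my $n = 1 + keys %fake_email;
        $fake_email{$real} = "user$n\@example.com";
    }
    return $fake_email{$real};
}

# Return the placeholder for an IPv4 address (spread over 10.x.y.z).
sub fake_ip
{
    my ($real) = @_;
    unless (exists $fake_ip{$real}) {
        my $n = 1 + keys %fake_ip;
        $fake_ip{$real} = sprintf('10.%d.%d.%d',
            ($n >> 16) & 255, ($n >> 8) & 255, $n & 255);
    }
    return $fake_ip{$real};
}

# Replace every <user@host> and every dotted IPv4 address in one line.
sub scramble_line
{
    my ($line) = @_;
    $line =~ s/<([^<>\s\@]+\@[^<>\s\@]+)>/'<' . fake_email($1) . '>'/ge;
    $line =~ s/\b(\d{1,3}(?:\.\d{1,3}){3})\b/fake_ip($1)/ge;
    return $line;
}

# Invented sample lines, only loosely shaped like a mail log.
my @sample = (
    'Jan  1 00:00:01 mailhost qmgr[123]: ABC123: from=<alice@customer.test>',
    'Jan  1 00:00:02 mailhost smtp[456]: ABC123: to=<bob@partner.test>, relay=203.0.113.7',
    'Jan  1 00:00:03 mailhost smtp[456]: ABC123: to=<bob@partner.test>, relay=203.0.113.7',
);
print scramble_line($_), "\n" for @sample;
```

To run it over a real file, the sample loop could be swapped for a plain while (<>) loop that prints scramble_line($_) for each line read.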

Anyhow, ERA, your advice was great and just what I needed to help guide me through this.
Here is the code that I ended up with after taking some of your suggestions and adding a hash to get rid of duplicates.
Code:
#!/usr/local/bin/perl
use strict;
use warnings;

my %sender_emails    = ();   # queue ID => sender address
my %recipient_emails = ();   # queue ID => space-separated recipient list
my %recipient_count  = ();   # queue ID => number of recipient lines seen
my %uniqueEntries    = ();   # "sender:recipient" => 1

# location of logfile
my $logfile = '/data/log/maillog';
open(my $log, '<', $logfile) or die "Cannot open $logfile: $!";
while (<$log>)
{
    # fields are: month, day, time, host, command, queue ID, rest of line
    my $QID = (split(/\s+/, $_))[5];
    next if (/from=<>/);        # skip mail with an empty sender
    next if (/from=<root>/);    # skip mail from root
    if (/qmgr/ && /from=<([^>]*)>/)
    {
        $sender_emails{$QID} = $1;
    }
    elsif (/smtp/ && /to=<([^>]*)>/)
    {
        $recipient_emails{$QID} .= "$1 ";
        $recipient_count{$QID}++;
    }
}
close($log);

foreach my $myQID (keys %sender_emails)
{
    # a sender line without any recipient line has nothing to record
    next unless defined $recipient_emails{$myQID};
    my $myfrom = $sender_emails{$myQID};
    next if $recipient_count{$myQID} >= 6;   # skip mail with 6+ recipient lines
    foreach my $myrcpt (split(' ', $recipient_emails{$myQID}))
    {
        # using the pair as a hash key drops duplicates automatically
        $uniqueEntries{$myfrom . ":" . $myrcpt} = 1;
    }
}

my $outbound_emails = '/data/whitelisting/outbound_emails';
open(my $obe, '>', $outbound_emails) or die "Cannot write $outbound_emails: $!";
foreach my $myPair (keys %uniqueEntries)
{
    print $myPair . "\n";
    print {$obe} $myPair . "\n";
}
close($obe);
It's perhaps not the most efficient script, but it runs in the middle of the night on mail machines that sit behind a load balancer, so I can get away with it for now. I will certainly try to update it when my Perl scripting skills have improved. Thanks again for all your help!
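
P.S. For anyone who only wants the duplicate-removal part, here is that idea on its own, as a small sketch with made-up addresses. The helper name unique_pairs is hypothetical. Unlike looping over keys of a hash, this variant keeps the pairs in first-seen order.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Take a list of [sender, recipient] pairs and return the distinct
# "sender:recipient" strings in the order they were first seen.
sub unique_pairs
{
    my %seen;
    # $seen{$key}++ is false only the first time a key turns up
    return grep { !$seen{$_}++ } map { join(':', @$_) } @_;
}

# Made-up pairs, including two repeats.
my @pairs = (
    [ 'alice@example.com', 'bob@example.net' ],
    [ 'alice@example.com', 'carol@example.org' ],
    [ 'alice@example.com', 'bob@example.net' ],
    [ 'dave@example.com',  'bob@example.net' ],
    [ 'dave@example.com',  'bob@example.net' ],
);

print "$_\n" for unique_pairs(@pairs);
```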
