#!/usr/bin/perl -w
use strict;
# three-argument open with a lexical filehandle
open (my $in, '<', 'data.in') or die "Cannot open data.in: $!\n";
my @lines = <$in>;
close ($in);
my @old; # original data items
my @new; # filtered data items
OUTER_LOOP:
foreach my $line ( @lines )
{
my @item = split (/\s+/, $line);
# apply your conditions
if ( $item[18] > 0.00 &&
$item[18] < 7.00 )
{
# check if we have that item already
# note that $have is an array reference
my $found = 0;
INNER_LOOP:
foreach my $have ( @new )
{
if ( $item[1] == $have->[1] &&
$item[2] == $have->[2] &&
$item[3] == $have->[3] &&
$item[4] == $have->[4] &&
$item[5] == $have->[5] )
{
# found it. So, replace new array entry
# with the one with larger #19
if ( $item[18] > $have->[18] )
{
# foreach aliases $have to the array element, so
# assigning to it replaces the entry in @new
$have = \@item;
}
# we should not find that item again, right? So,
# finish that inner foreach here
$found = 1;
last INNER_LOOP;
}
}
# not seen before, so keep it as a new entry
push (@new, \@item) unless $found;
}
# add similar tests for the other conditions below
elsif ( ... )
{
}
} # all lines
open (my $out_fh, '>', 'data.out') or die "Cannot open data.out: $!\n";
foreach my $out ( @new )
{
# print data tab separated
print $out_fh join ("\t", @{$out});
print $out_fh "\n";
}
close ($out_fh) or die "Cannot close data.out: $!\n";
The above is not tested and likely has typos, and indentation is hard to do in a form (sorry), but I hope you get the idea. If you intend to go that route, let me know if you have trouble with the code.
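For larger files, the nested loop above compares every line against every kept item. A hash keyed on the identifying columns avoids that. The following is only a sketch of that alternative, not the code from the post: the helper name `dedupe_rows` and the sample rows are made up, and it assumes the key columns can be compared as strings (the loop above compares them numerically with `==`, so `1.0` and `1` would be treated as equal there but not here).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: for each combination of columns 2-6, keep the row
# with the largest value in column 19 (index 18), within the 0..7 range.
sub dedupe_rows {
    my (@lines) = @_;
    my %best;     # key string -> array ref of the best row seen so far
    my @order;    # keys in first-seen order, so the output order is stable

    foreach my $line (@lines) {
        my @item = split ' ', $line;
        next unless @item >= 19;                          # skip short lines
        next unless $item[18] > 0.00 && $item[18] < 7.00; # same condition as above

        # Fields cannot contain whitespace after the split, so a tab is a safe joiner.
        my $key = join "\t", @item[1 .. 5];

        if (!exists $best{$key}) {
            push @order, $key;
            $best{$key} = \@item;
        }
        elsif ($item[18] > $best{$key}[18]) {
            $best{$key} = \@item;
        }
    }
    return map { $best{$_} } @order;
}

# Made-up sample data: 19 whitespace-separated columns per line.
my @sample = (
    join(' ', 'a', 1, 2, 3, 4, 5, ('x') x 12, 2.5),
    join(' ', 'b', 1, 2, 3, 4, 5, ('x') x 12, 6.0),   # same key, larger column 19
    join(' ', 'c', 9, 9, 9, 9, 9, ('x') x 12, 1.0),   # different key
    join(' ', 'd', 1, 2, 3, 4, 5, ('x') x 12, 8.0),   # outside the range, ignored
);

print join("\t", @$_), "\n" for dedupe_rows(@sample);
```

To use it on the real file, pass the lines read from data.in instead of `@sample` and print the result to data.out.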
ExtUtils::Packlist(3perl) Perl Programmers Reference Guide ExtUtils::Packlist(3perl)
NAME
ExtUtils::Packlist - manage .packlist files
SYNOPSIS
use ExtUtils::Packlist;
my ($pl) = ExtUtils::Packlist->new('.packlist');
$pl->read('/an/old/.packlist');
my @missing_files = $pl->validate();
$pl->write('/a/new/.packlist');
$pl->{'/some/file/name'}++;
or
$pl->{'/some/other/file/name'} = { type => 'file',
from => '/some/file' };
DESCRIPTION
ExtUtils::Packlist provides a standard way to manage .packlist files. Functions are provided to read and write .packlist files. The
original .packlist format is a simple list of absolute pathnames, one per line. In addition, this package supports an extended format,
where as well as a filename each line may contain a list of attributes in the form of a space separated list of key=value pairs. This is
used by the installperl script to differentiate between files and links, for example.
USAGE
The hash reference returned by the new() function can be used to examine and modify the contents of the .packlist. Items may be
added/deleted from the .packlist by modifying the hash. If the value associated with a hash key is a scalar, the entry written to the
.packlist by any subsequent write() will be a simple filename. If the value is a hash, the entry written will be the filename followed by
the key=value pairs from the hash. Reading back the .packlist will recreate the original entries.
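The round trip described above can be tried with a throwaway file. This is a minimal sketch, assuming a standard Perl installation where ExtUtils::Packlist and File::Temp are available; the temporary directory is created only for the demonstration, and the entry names are the ones from the SYNOPSIS.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use ExtUtils::Packlist;

# Work in a throwaway directory that is removed when the script exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, '.packlist');

# The file does not exist yet, so the hash starts out empty.
my $pl = ExtUtils::Packlist->new($file);

# Scalar value: the entry is written as a bare filename.
$pl->{'/some/file/name'}++;
# Hash value: the entry is written as the filename followed by key=value pairs.
$pl->{'/some/other/file/name'} = { type => 'file', from => '/some/file' };

# No argument: write to the name that was given to new().
$pl->write();

# Reading the file back should recreate the entries.
my $copy = ExtUtils::Packlist->new($file);
foreach my $path (sort keys %$copy) {
    my $attr = $copy->{$path};
    if (ref $attr) {
        print "$path: ", join(', ', map { "$_=$attr->{$_}" } sort keys %$attr), "\n";
    }
    else {
        print "$path\n";
    }
}
```

Note that the scalar entry comes back with an undefined value, as described for old-style entries, so only the presence of the key is preserved.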
FUNCTIONS
new()
This takes an optional parameter, the name of a .packlist. If the file exists, it will be opened and the contents of the file will be
read. The new() method returns a reference to a hash. This hash holds an entry for each line in the .packlist. In the case of old-
style .packlists, the value associated with each key is undef. In the case of new-style .packlists, the value associated with each key
is a hash containing the key=value pairs following the filename in the .packlist.
read()
This takes an optional parameter, the name of the .packlist to be read. If no file is specified, the .packlist specified to new() will
be read. If the .packlist does not exist, Carp::croak will be called.
write()
This takes an optional parameter, the name of the .packlist to be written. If no file is specified, the .packlist specified to new()
will be overwritten.
validate()
This checks that every file listed in the .packlist actually exists. If an argument which evaluates to true is given, any missing
files will be removed from the internal hash. The return value is a list of the missing files, which will be empty if they all exist.
packlist_file()
This returns the name of the associated .packlist file.
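As an illustration of validate() with and without its argument, here is a small sketch. It assumes ExtUtils::Packlist and File::Temp are installed; the two file names are invented for the demonstration, and only one of them is actually created.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use ExtUtils::Packlist;

my $dir     = tempdir(CLEANUP => 1);
my $present = File::Spec->catfile($dir, 'present.txt');   # made-up name
my $absent  = File::Spec->catfile($dir, 'absent.txt');    # made-up name

# Create only one of the two files.
open(my $fh, '>', $present) or die "Cannot create $present: $!\n";
close($fh);

# No .packlist name given: start with an empty in-memory list.
my $pl = ExtUtils::Packlist->new();
$pl->{$present}++;
$pl->{$absent}++;

# Without an argument, validate() only reports the missing files.
my @missing = $pl->validate();
print "missing: @missing\n";
print "entries before pruning: ", scalar(keys %$pl), "\n";

# With a true argument, missing files are also removed from the hash.
$pl->validate(1);
print "entries after pruning: ", scalar(keys %$pl), "\n";
```

Nothing is written to disk here; a later write() with a file name would be needed to save the pruned list.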
EXAMPLE
Here's "modrm", a little utility to cleanly remove an installed module.
#!/usr/local/bin/perl -w
use strict;
use IO::Dir;
use ExtUtils::Packlist;
use ExtUtils::Installed;

sub emptydir($) {
    my ($dir) = @_;
    my $dh = IO::Dir->new($dir) || return(0);
    my @count = $dh->read();
    $dh->close();
    return(@count == 2 ? 1 : 0);
}

# Find all the installed packages
print("Finding all installed modules...\n");
my $installed = ExtUtils::Installed->new();

foreach my $module (grep(!/^Perl$/, $installed->modules())) {
    my $version = $installed->version($module) || "???";
    print("Found module $module Version $version\n");
    print("Do you want to delete $module? [n] ");
    my $r = <STDIN>; chomp($r);
    if ($r && $r =~ /^y/i) {
        # Remove all the files
        foreach my $file (sort($installed->files($module))) {
            print("rm $file\n");
            unlink($file);
        }
        my $pf = $installed->packlist($module)->packlist_file();
        print("rm $pf\n");
        unlink($pf);
        foreach my $dir (sort($installed->directory_tree($module))) {
            if (emptydir($dir)) {
                print("rmdir $dir\n");
                rmdir($dir);
            }
        }
    }
}
AUTHOR
Alan Burlison <Alan.Burlison@uk.sun.com>
POD ERRORS
Hey! The above document had some coding errors, which are explained below:
Around line 39:
=back without =over
perl v5.14.2 2010-12-30 ExtUtils::Packlist(3perl)