Perl: filtering lines based on duplicate values in a column
Post 302558152 by rangarasan on Friday 23rd of September 2011 01:48:21 AM
09-23-2011
PERL

Hi,

Try this code:

Code:
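#! /usr/local/bin/perl
use strict;
use warnings;

# Read File1 and count how often each first-column value occurs.
open(my $fh, '<', 'File1') or die("unable to open File1: $!");
my %mFinal = ();
while ( my $mLine = <$fh> )
{
   chomp ( $mLine );
   # The first space-separated field is the key.
   my $mField = (split(/ /,$mLine,2))[0];
   next unless defined $mField;   # skip blank lines
   $mFinal{$mField}{"count"}++;
   # Only the last line seen for each key is kept.
   $mFinal{$mField}{"content"}=$mLine;
}
close($fh);

# Print the stored line for every key that does not occur exactly 10 times.
# keys() returns the hash keys in no particular order.
foreach my $mField ( keys %mFinal )
{
   my $mCount = $mFinal{$mField}{"count"};
   if ( $mCount != 10 )
   {
      print "$mFinal{$mField}{'content'}\n";
   }
}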
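
The code above keeps only the last line for each first-column value and prints in hash order. If you need every matching line, in the order it appears in the input, a two-pass variant might look like the sketch below. The sub name lines_not_repeated and the sample data are made up for illustration; adjust the count to whatever your input needs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return every line whose
# first-column value does NOT occur exactly $times times, in input order.
sub lines_not_repeated {
    my ($lines, $times) = @_;

    # Pass 1: count occurrences of each first-column value.
    my %count;
    for my $line (@$lines) {
        my ($key) = split / /, $line, 2;
        next unless defined $key;      # blank line: no first column
        $count{$key}++;
    }

    # Pass 2: keep the lines whose key count differs from $times.
    my @kept;
    for my $line (@$lines) {
        my ($key) = split / /, $line, 2;
        next unless defined $key;
        push @kept, $line if $count{$key} != $times;
    }
    return @kept;
}

# Made-up sample data: "a" occurs twice, "b" once.
my @sample = ('a 1', 'b 2', 'a 3');
print "$_\n" for lines_not_repeated(\@sample, 2);
```

With the sample data and a count of 2, only the "b 2" line is printed, because "a" occurs exactly twice. The price of keeping input order is that the whole input is held in memory and walked twice.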


Cheers,
Ranga
