Sort file in perl


 
# 1  
Old 06-20-2002

Hi,
I have an input file for a Perl script from which I need to remove duplicate entries.
For example:

one:two:three
one:four:five
two:one:three

must become:

one:two:three
two:one:three

Only the first field decides whether an entry is a duplicate. I tried many options of the sort system command but couldn't find how to do this.
Can someone help?
Thanks
# 2  
Old 06-20-2002
So simple, even *I* might be able to help with this one.

When I need to remove duplicate entries in a text file, I use a Perl hash to get the job done.

Here is some sample code that I pulled from a previous message that I responded to. [Hint: Forum search is your friend]

Code:
#!/usr/bin/perl

# RemoveDupes.pl
# Auswipe 21 Feb 2002
# Auswipe sez: "Hey, no guarantees!"
# Usage:
#
#	RemoveDupes.pl -file someTextFile

use strict;
use warnings;
use Getopt::Long;

my $opt_file;
GetOptions("file=s" => \$opt_file);

my %dataHash    = ();   # line text => line number of its first appearance
my $currentLine = 0;

if ($opt_file) {
  open(my $inputFile, '<', $opt_file) || die "Error: $!";

  while (my $logEntry = <$inputFile>) {
    chomp($logEntry);

    # Remember only the first time each line is seen.
    if (!exists($dataHash{$logEntry})) {
      $dataHash{$logEntry} = $currentLine;
    }

    $currentLine++;
  }

  # Close the filehandle (the original closed the file name by mistake).
  close($inputFile);

} else {
  print STDOUT "You didn't select a file!\n";
}

# Print the unique lines in their original order.
foreach my $logOutput (sort { $dataHash{$a} <=> $dataHash{$b} } (keys(%dataHash))) {
  print STDOUT "$logOutput\n";
}

# 3  
Old 06-20-2002
D'oh!

I was re-reading your message and I see that you need to remove duplicates based upon the FIRST field of the colon-separated values.

That makes it a bit trickier but I'll see what I can do to help. The previous Perl script is still good for complete lines of duplicate text.

EDIT: This code might help, however there might be some problems. The first pass removes whole-line duplicates (kept in their original order), and the second pass removes dupes based upon the first colon-separated value. This might be a problem for you in your application.

Give it a try and lemme know if it gets the job done.

Code:
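#!/usr/bin/perl

# RemoveDupes.pl
# Auswipe 21 Feb 2002
# Auswipe sez: "Hey, no guarantees!"
# Usage:
#
#	RemoveDupes.pl -file someTextFile

use strict;
use warnings;
use Getopt::Long;

my $opt_file;
GetOptions("file=s" => \$opt_file);

my %dataHash    = ();   # line text => line number of its first appearance
my $currentLine = 0;

if ($opt_file) {
  open(my $inputFile, '<', $opt_file) || die "Error: $!";

  while (my $logEntry = <$inputFile>) {
    chomp($logEntry);

    if (!exists($dataHash{$logEntry})) {
      $dataHash{$logEntry} = $currentLine;
    }

    $currentLine++;
  }

  # Close the filehandle (the original closed the file name by mistake).
  close($inputFile);

} else {
  print STDOUT "You didn't select a file!\n";
}

my %secondHash = ();    # first field => first line that carried it

foreach my $logOutput (sort { $dataHash{$a} <=> $dataHash{$b} } (keys(%dataHash))) {
  my @columns = split(/:/, $logOutput);
  # An empty line splits to nothing, so fall back to an empty key.
  my $firstColumn = defined($columns[0]) ? $columns[0] : '';

  if (!exists($secondHash{$firstColumn})) {
    $secondHash{$firstColumn} = $logOutput;
  }
}

# Sort by the original line number of each kept line. Comparing the stored
# lines themselves with <=> treats them as numbers and loses the order.
foreach my $firstColumn (sort { $dataHash{$secondHash{$a}} <=> $dataHash{$secondHash{$b}} } (keys(%secondHash))) {
  print STDOUT "$secondHash{$firstColumn}\n";
}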


Last edited by auswipe; 06-20-2002 at 03:30 PM..
# 4  
Old 06-21-2002
Hi, thanks for the help, but it is not working.
I get an error:

my %dataHash = (/: unmatched () in regexp line 10

Here is my script:
#!/usr/bin/perl

use Getopt::Long;
GetOption(file=s);

my %dataHash = ();
my $currentline = 0;
$entry = "/var/yp/script/removefile";

if ($entry)
{
open (IN, "$entry") || die "Error: $!";
while ($logentry = <IN>)
{
chomp($logentry);
if(!exists($dataHash($logentry)))
{
$dataHash($logentry) = $currentline;
};
$currentline++;
};
close(IN);
} else {
print "You didn't select a file\n";
};

my %secondHash = ();
foreach $logOutput (sort { $dataHash{$a} <=> $dataHash{$b}} (keys(%dataHash)))
{
my @columns = split (/:/,$logOutput);
my $firstcolumn = $columns[0];
if (!exists($secondHash{$firstcolumn}))
{
$secondHash{$firstcolumn} = $logOutput;
}
}

foreach $firstcolumn (sort { $secondHash{$a} <=> $secondHash{$b} (keys(%dataHash)))
{
print "$secondHash{$firstcolumn}\n";
}

Thanks
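
Note for later readers: the reported error is consistent with the unquoted option spec in `GetOption(file=s);`. Without quotes, Perl most likely reads the `s)` as the start of a substitution delimited by `)`, which swallows the following lines up to `my %dataHash = (`. The script also calls `GetOption` rather than `GetOptions`, subscripts the hash with parentheses instead of braces, and leaves the last sort block unclosed. A minimal sketch of the corrected pieces follows; the stand-in line and the simplified final loop are illustrative assumptions, not the poster's final code.

```perl
use strict;
use warnings;
use Getopt::Long;

# Quote the option spec and call GetOptions (plural).
my $entry = "/var/yp/script/removefile";   # default path from the post
GetOptions("file=s" => \$entry);

my %dataHash;
my $currentline = 0;

# Hash elements are subscripted with braces, not parentheses.
my $logentry = "one:two:three";            # stand-in for a line read from IN
if (!exists($dataHash{$logentry})) {
    $dataHash{$logentry} = $currentline;
}

# The last loop needs a complete sort expression and should walk
# %secondHash, not %dataHash. Sorted by key here only for simplicity.
my %secondHash = (one => $logentry);
foreach my $firstcolumn (sort keys(%secondHash)) {
    print "$secondHash{$firstcolumn}\n";
}
```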
# 5  
Old 06-21-2002
No need to look any further!
I found another way to reach the final goal.
Thanks for the help.
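
The poster did not share the other way, so the following is only a sketch of one common approach, not the solution actually used. It keeps the first line seen for each value of the first colon-separated field and preserves the input order, using a single "seen" hash. The sub name `dedupe_by_first_field` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keep only the first line seen for each value of the first
# colon-separated field, preserving the original line order.
# (dedupe_by_first_field is a hypothetical name for this sketch.)
sub dedupe_by_first_field {
    my @lines = @_;
    my %seen;    # first-field value => number of times it has appeared
    my @kept;
    for my $line (@lines) {
        # A limit of 2 is enough: only the first field is needed.
        my ($key) = split /:/, $line, 2;
        $key = '' unless defined $key;    # an empty line splits to nothing
        push @kept, $line unless $seen{$key}++;
    }
    return @kept;
}

# Sample data from the first post.
my @input = ("one:two:three", "one:four:five", "two:one:three");
print "$_\n" for dedupe_by_first_field(@input);
```

On the sample data this prints `one:two:three` followed by `two:one:three`. To run it on a file, the lines could be read with `chomp(my @lines = <$fh>)` and passed to the sub.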