Sponsored Content
Top Forums Shell Programming and Scripting Eliminating duplicate lines via specified number of digits Post 302824531 by Fundix on Friday 21st of June 2013 05:02:22 AM
Old 06-21-2013
A Perl solution using 2 hastables :
Code:
#!/usr/bin/perl -w
use strict;

my $cur_dir = $ENV{PWD};
my $filename = "$cur_dir/$ARGV[0]";
my ($record,@fields,$prefix,%peers,$key,%records);

open(FILEIN,"<$filename") or die"open: $!";
while( defined( $record = <FILEIN> ) ) {
  chomp $record;

  @fields=split(/ /,$record);
  $prefix=substr($fields[0],0,3);

  if(! exists( $peers{$prefix} ) || $fields[1] < $peers{$prefix} ) {
    $peers{$prefix} = $fields[1];
    $records{$prefix} = $record;
  }
}
close(FILEIN);

foreach my $key (sort keys(%records)) {
   print "$records{$key}\n";
}

output :
Code:
%./file031.pl file031
123 45.34
345-xx 32.56
888 22.33
949-xx 36.55


Last edited by Fundix; 06-21-2013 at 06:03 AM.. Reason: Missing }
 

10 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

restrain the number of digits of a PID

How is it possible under UNIX to restrain the number of digits of the PID number? For instance, we have a product that generates a PID of 7 digits, and we would like to have only 6 digits maximum instead for the PID. Thank you for your help. (1 Reply)
Discussion started by: mlefebvr
1 Replies

2. UNIX for Dummies Questions & Answers

Counting The Number Of Duplicate Lines In a File

Hello. First time poster here. I have a huge file of IP numbers. I am trying to output only the class b of the IPs and rank them by most common and output the total # of duplicate class b's before the class b. An example is below: 12.107.1.1 12.107.9.54 12.108.3.89 12.109.109.4 12.109.6.3 ... (2 Replies)
Discussion started by: crunchtime
2 Replies

3. Shell Programming and Scripting

Count number of digits in a word

Hi all Can anybody suggest me, how to get the count of digits in a word I tried WORD=abcd1234 echo $WORD | grep -oE ] | wc -l 4 It works in bash command line, but not in scripts :mad: (12 Replies)
Discussion started by: ./hari.sh
12 Replies

4. UNIX for Dummies Questions & Answers

Eliminating CR (new lines) from a file.

Hi all, I made a C++ program in dos (in dev-C++) and uploaded it on Solaris box. On opening that file with 'vim' editor i found that there is some extra new lines after each written code line. I tried to find out is the file is in dos or in unix format, with 'file' command,and i got "<file-name>.h:... (4 Replies)
Discussion started by: KornFire
4 Replies

5. Shell Programming and Scripting

number of digits after decimal

Hi All, I have a file of decimal numbers, cat file1.txt 1.1382666907 1.2603107334 1.6118799297 24.4995857056 494.7632588468 560.7633734425 ..... I want to see the output as only 7 digits after decimal (5 Replies)
Discussion started by: senayasma
5 Replies

6. Shell Programming and Scripting

summing the digits of a binary nuMBER

please help me write a perl program to find the difference of 1 and zeros of a 6 digit binary number. eg If input is 111100 expected output +2 if input is 000011 expected output -2 input is 000111 expected output 0 (2 Replies)
Discussion started by: dll_fpga
2 Replies

7. Shell Programming and Scripting

extracting Number variable and the following digits.

HI all, I have output of something like this: crab: ExitCodes Summary >>>>>>>>> 12 Jobs with Wrapper Exit Code : 50117 List of jobs: 1-12 See https:///twiki/something/ for Exit Code meaning crab: ExitCodes Summary >>>>>>>>> 5 Jobs with Wrapper Exit Code : 8001 List of... (20 Replies)
Discussion started by: emily
20 Replies

8. Shell Programming and Scripting

Eliminating duplicate lines

Hello, I am trying to eliminate lines where column #1 is duplicated. If it is a duplicate, the line with the greater value in column #2 should be deleted: file.dat 123 45.34 345 67.22 949 36.55 123 94.23 888 22.33 345 32.56 Desired ouput 123 45.34 949 36.55 888 22.33 345 32.56... (4 Replies)
Discussion started by: palex
4 Replies

9. Shell Programming and Scripting

awk changes to cut number of digits

HCPM1ONDB00014800011800000589009211201 L201307022013070228AUD 00000000031. 000965105800000000000000000000000 MOBITV KEYA ... (4 Replies)
Discussion started by: mirwasim
4 Replies

10. Shell Programming and Scripting

Find number of digits in a word

HI, Can you tell me how to find the number of digits in a word. $cat data.txt +123456ad 87645768 Output should be 6 8 (5 Replies)
Discussion started by: ashwin3086
5 Replies
COMM(1) 						    BSD General Commands Manual 						   COMM(1)

NAME
comm -- select or reject lines common to two files SYNOPSIS
comm [-123i] file1 file2 DESCRIPTION
The comm utility reads file1 and file2, which should be sorted lexically, and produces three text columns as output: lines only in file1; lines only in file2; and lines in both files. The filename ``-'' means the standard input. The following options are available: -1 Suppress printing of column 1. -2 Suppress printing of column 2. -3 Suppress printing of column 3. -i Case insensitive comparison of lines. Each column will have a number of tab characters prepended to it equal to the number of lower numbered columns that are being printed. For example, if column number two is being suppressed, lines printed in column number one will not have any tabs preceding them, and lines printed in column number three will have one. The comm utility assumes that the files are lexically sorted; all characters participate in line comparisons. ENVIRONMENT
The LANG, LC_ALL, LC_COLLATE, and LC_CTYPE environment variables affect the execution of comm as described in environ(7). EXIT STATUS
The comm utility exits 0 on success, and >0 if an error occurs. SEE ALSO
cmp(1), diff(1), sort(1), uniq(1) STANDARDS
The comm utility conforms to IEEE Std 1003.2-1992 (``POSIX.2''). The -i option is an extension to the POSIX standard. HISTORY
A comm command appeared in Version 4 AT&T UNIX. BUGS
Input lines are limited to LINE_MAX (2048) characters in length. BSD
January 26, 2005 BSD
All times are GMT -4. The time now is 04:48 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy