Perl- Finding average "frequency" of occurrence of duplicate lines


 
# 8  
Old 08-09-2011
Just change to
Code:
push @{$seen{"@F[3..16]"}}, $F[0];

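Interpolating the slice joins the fields with a single space, so any run of whitespace in the input collapses to one space. If you need to preserve the original spacing, use substr() on $_ instead. If you want a different output separator, set it in the END block before any print, like this: $\="\t"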

The "-a" switch splits every input line into the @F array. We then push the first field onto an anonymous array stored in the hash %seen, where the key is the joined array slice. In the END block, for every unique key we count how many timestamps there are, total the seconds between them, and compute the average.

---

Sorry, I was wrong about the output separator. To change it, set the $" variable in a BEGIN block.
Code:
perl -ane '
  BEGIN {
    $"="\t";
  }
  push @{$seen{"@F[3..16]"}}, $F[0];
  END {
    for $key (sort keys %seen) {
      @ts = @{$seen{$key}};
      $n = @ts;
      $prev = $ts[0];
      $nt = 0;
      print "$key $n ";
      for $time (@ts) {
        $nt += $time - $prev;
      }
      print $nt/$n, "\n";
    }
  }' INPUTFILE


Last edited by yazu; 08-09-2011 at 05:34 AM.. Reason: Oops
# 9  
Old 08-09-2011
Quote:
Originally Posted by yazu
Just change to
Code:
perl -ane '
  BEGIN {
    $"="\t";
  }
  push @{$seen{"@F[3..16]"}}, $F[0];
  END {
    for $key (sort keys %seen) {
      @ts = @{$seen{$key}};
      $n = @ts;
      $prev = $ts[0];
      $nt = 0;
      print "$key $n ";
      for $time (@ts) {
        $nt += $time - $prev;
      }
      print $nt/$n, "\n";
    }
  }' INPUTFILE

There is another small problem I found. The reference timestamp is static: it should count the seconds since the last appearance, but right now it counts the seconds since the FIRST appearance every time. In your example, this makes the gaps for 'a' come out as 2, then 5, then 6, which gives an average of 3.25, whereas the real average should be taken over 2, 3 and 1 (which gives 1.5).
# 10  
Old 08-09-2011
Change to:
Code:
      for $time (@ts) {
        $nt += $time - $prev;
        $prev = $time;
      }

# 11  
Old 08-11-2011
Thanks for all your help yazu!!

Is it possible to do it the other way (keep track of the number of lines between repetitions and then make an avg)?

-----------------
I guess I only have to replace the timestamps with the current input line number in the code to get the average number of lines between repetitions.

So then it becomes:

Code:
push @{$seen{"@F[3..16]"}}, $.;

Or so I think!

Last edited by acsg; 08-11-2011 at 08:47 AM.. Reason: $. instead of $F[0]