awk- looping through groups of lines


 
# 1  
Old 05-06-2011

Hello,

I'm working with a file that has three columns. The first one represents a certain channel and the third one a timestamp (second one is not important). Example input is as follows:


Code:
2513   12   10.771
 2513   13   10.771
 2513   14   10.771
 2513   15   10.771
 2644    8    10.771
 2645   14    10.771
 2647     7    10.771
----------------------
 2513     0    10.772
 2513     1    10.772
 2513     2    10.772
 2513     3    10.772
 2513     4    10.772
 2513     5    10.772
 2513     6    10.772
----------------------
 2513     7    10.772
 2513     8    10.772
 2513     9    10.772
 2513     10    10.772
 2513     11  10.772
 2513     12   10.772
 2513     13   10.772



The input doesn't have the "----------------------" part, I just put it there so the groups of lines that I want to analyze become a bit clearer.


I want to analyze the lines in groups of 7, since 7 lines with the same timestamp represent 1 packet. The problem is that timestamps sometimes repeat across packets, so you might find 14 or 21 consecutive lines with the same timestamp (even though the values in the other two columns vary). What I want is a count of how often each first-column value (channel) appears, counted only once per packet, that is, once per group of 7 lines.

Desired output:

Code:
2513 3
2644 1
2645 1
2647 1



The code I've tried so far doesn't consider the repeated fields (the groups of 7), so it only counts one time per timestamp (which means I get a value of 2 instead of 3 for channel 2513):

Code:
awk '{
    while (getline > 0 && NF > 0) {
        timec = $3
        pidc = $1
        if (timec == $3 && pidc != pidp) {
            pid[$1]++
        }
        pidp = $1
    }
}
END { for (i in pid) print i, pid[i] }'



Any help is much appreciated.
Thanks!

Last edited by acsg; 05-06-2011 at 08:41 AM.. Reason: clearer input v1.2
# 2  
Old 05-06-2011
I think you want the first line to say:
Code:
2513 4

Also post desired output for the rest of that sample data (10.772 timestamp).
# 3  
Old 05-06-2011
Quote:
Originally Posted by bartus11
I think you want the first line to say:
Code:
2513 4

Also post desired output for the rest of that sample data (10.772 timestamp).

Hello,

The desired output is for the whole input: I want to count how many 'packets' (groups of 7 lines) each channel appears in. For example, channel 2513 appears in all 3 packets, so its count is 3.
# 4  
Old 05-06-2011
Like this?
Code:
awk '{B[$1]} !(NR%7){for(i in B){delete B[i];A[i]++}} END{for(i in A)print i,A[i]}' infile

# 5  
Old 05-06-2011
Try:
Code:
perl -lane '$x=int(($.-1)/7);$a{$x}{$F[0]}=1;END{for $i (keys %a){for $j (keys %{$a{$i}}){$b{$j}++}};for $i (keys %b){print "$i $b{$i}"}}' file
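
The Perl one-liner assigns every line a group index, int((line number - 1)/7), and records each channel once per index. For readers who prefer to stay in awk, here is a hedged sketch of the same idea. The function name count_by_group_index and the variables n, g, seen and count are names chosen here for illustration, and the trailing sort -n is only there to make the output order predictable.

```shell
#!/bin/sh
# Sketch only: count, per channel, the number of 7-line groups it occurs in,
# using an explicit group index instead of resetting an array every 7 lines.
count_by_group_index() {
    awk -v n=7 '
        {
            g = int((NR - 1) / n)          # 0-based group index of this line
            if (!((g, $1) in seen)) {      # first time this channel occurs in group g
                seen[g, $1] = 1
                count[$1]++
            }
        }
        END { for (ch in count) print ch, count[ch] }
    ' "$@" | sort -n
}
```

Unlike a check that only fires on every 7th line, this variant also counts channels in a trailing group shorter than 7 lines, which may or may not be what you want.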

# 6  
Old 05-09-2011
Quote:
Originally Posted by Scrutinizer
Like this?
Code:
awk '{B[$1]} !(NR%7){for(i in B){delete B[i];A[i]++}} END{for(i in A)print i,A[i]}' infile



Thank you!! This seems to do the trick, but I don't quite understand how it does it. Could you please explain what the !(NR%7) is for, and why you used 'delete'?
# 7  
Old 05-09-2011
Hi, here is a clarification:

{B[$1]}: Create an array element B[$1]. If such an element already exists, no new element is created, so each distinct value of $1 produces exactly one element, irrespective of the number of occurrences of $1 (within a group of 7, see below).
!(NR%7): NR%7 is the remainder of the line number divided by 7. It equals 0 only when the line number is a multiple of 7, and !0 is true, so the action that follows runs each time seven more lines have been read.
for(i in B){delete B[i];A[i]++}: For each element in B, discard the array element B[i] and increase the count A[i]. Afterwards all elements in array B will have been discarded, so B starts empty for the next group. This sequence is repeated after every 7 lines.
END{for(i in A)print i,A[i]}: Print every element in array A together with its value.
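
The same steps can be read off a spelled-out version of the one-liner. This is a minimal sketch with the same behavior, not a different method: the function name count_per_packet and the array names seen and count (for B and A) are chosen here for illustration, and sort -n is added only so the output order is predictable.

```shell
#!/bin/sh
# Sketch only: expanded, commented form of the one-liner discussed above.
count_per_packet() {
    awk '
        { seen[$1] }                 # referencing the element creates it, once per channel
        NR % 7 == 0 {                # true on lines 7, 14, 21, ...: a group is complete
            for (ch in seen) {
                count[ch]++          # this channel occurred in the group just finished
                delete seen[ch]      # empty the per-group array for the next group
            }
        }
        END { for (ch in count) print ch, count[ch] }
    ' "$@" | sort -n
}
```

Called as count_per_packet infile on the 21-line sample from the first post, this prints 2513 3, 2644 1, 2645 1 and 2647 1. Note that lines left over after the last full group of 7 are never counted, because the counting step only runs on every 7th line.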
Hope this helps...

S.