Aggregate data within the file


 
# 8  
Old 11-25-2015
Code:
cat venky338.file2

ord,line,Date,Hour,count,appr
1217,1,11/19/2015,"00",8,""
1217,1,11/19/2015,"08",4,""
1217,1,11/19/2015,"10",62,""
1217,1,11/19/2015,"09",20,""
1217,1,11/19/2015,"00",24,""
1217,1,11/19/2015,"10",42,""

Code:
perl -anlF"," -e '$.==1?{print}:{@{$r{join ",", @F[0,1,2,3]}}[0]+=$F[4]}; END{$,=",";for(sort keys %r){print $_,$r{$_}[0],qq("")}}' venky338.file2

or
Code:
perl -nle '/(.+,)(.+,.+)/;$.==1?{print}:{@{$r{$1}}[0]+=$2}; END{for(sort keys %r){print "$_$r{$_}[0],\"\""}}' venky338.file2

Code:
ord,line,Date,Hour,count,appr
1217,1,11/19/2015,"00",32,""
1217,1,11/19/2015,"08",4,""
1217,1,11/19/2015,"09",20,""
1217,1,11/19/2015,"10",104,""

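Brief explanation of the last version:
Code:
perl # the Perl interpreter
-nle # same as -n -l -e
-n # loop over the input one line at a time, without printing each line automatically
-l # strip the trailing newline from each input line and add one back on every print
-e # run the following argument as Perl code

/(.+,)(.+,.+)/; # split the line in two: because .+ is greedy, $1 is everything up to and including the second-to-last comma (ord,line,Date,Hour,) and $2 is the rest (count,appr)

$.==1?{print}:{@{$r{$1}}[0]+=$2}; # if it is the first line (the header) just print it; otherwise use $1 as the key of a hash and add $2 to its running total ($2 is used as a number, so only its leading count is added)

END{for(sort keys %r){print "$_$r{$_}[0],\"\""}} # after the last line, print every unique key in string-sorted order, followed by its total and an empty quoted appr field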


Last edited by Aia; 11-25-2015 at 02:25 AM..
# 9  
Old 11-25-2015
Note that for any of the solutions suggested with sorted output, the keys are compared as strings, so the output won't be sorted by the ord field unless all ord values have the same number of digits; won't be sorted by line within ord unless all line values are single digits; and won't be sorted by date within ord and line unless all dates are for the same year and all months and days are presented as two digits (which we can't tell from the sample given). But, as long as the hour field is always represented with two digits surrounded by double quotes, the hour fields for any given ord, line, and date triple will be in sorted order.
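
A small sketch of the ord and line problem, using sort(1) to stand in for the string comparison. The four keys are made up for illustration (the thread's sample only has ord 1217 and line 1), and LC_ALL=C is set so the comparison is byte-wise.
Code:
```shell
#!/bin/sh
# Hypothetical ord,line pairs (not from the thread) with differing digit counts.
keys='1217,1
998,2
1217,10
1217,2'

# Byte-wise string comparison, comparable to a plain "sort keys" in Perl.
lexical=$(printf '%s\n' "$keys" | LC_ALL=C sort)

# Numeric comparison on ord, then on line.
numeric=$(printf '%s\n' "$keys" | LC_ALL=C sort -t, -k1,1n -k2,2n)

printf 'string order:\n%s\n\nnumeric order:\n%s\n' "$lexical" "$numeric"
```
In string order 998 lands after every 1217 key and line 10 lands before line 2; in numeric order both come out where you would expect.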

With the awk suggestion I provided in post #7, the output order is unspecified, so effectively random (but the count will be aggregated into a single output line for each input ord, line, date, and hour quadruple whether or not the input is sorted).
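
Post #7 is not reproduced here, so the following is only a sketch of that style of awk aggregation, not necessarily the code from that post. The function name aggregate_unordered and the sum array are illustrative; the three data rows are taken from the sample and are deliberately not in hour order.
Code:
```shell
#!/bin/sh
# Header plus three rows from the thread's sample, in unsorted hour order.
input='ord,line,Date,Hour,count,appr
1217,1,11/19/2015,"10",62,""
1217,1,11/19/2015,"00",8,""
1217,1,11/19/2015,"10",42,""'

# aggregate_unordered is a hypothetical helper name; it reads CSV on stdin.
aggregate_unordered() {
    awk -F, -v OFS=, '
        NR == 1 { print; next }                  # header: print unchanged
        { sum[$1 OFS $2 OFS $3 OFS $4] += $5 }   # key = ord,line,Date,Hour
        END {
            # "for (key in sum)" visits keys in an unspecified order
            for (key in sum) print key, sum[key], "\"\""
        }
    '
}

printf '%s\n' "$input" | aggregate_unordered
```
The two "10" rows are combined into one line with a count of 104 even though they are not adjacent in the input; which of the two data lines is printed first depends on the awk implementation.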

If you do need output that is sorted with the ord value as the primary key, the line value as the secondary key, the date as the tertiary key, and the hour as the quaternary key, we need more details about the formats of the first three fields.
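
Until those details are known, here is one possible sketch under stated assumptions: ord and line are plain non-negative integers, dates are month/day/year with a four-digit year (months and days may or may not be zero-padded), the hour is always two quoted digits, and no field contains a tab. The helper name sort_rows and the sample rows are made up for illustration. It prefixes each row with sortable keys, sorts on them, and strips them again.
Code:
```shell
#!/bin/sh
# Hypothetical aggregated rows (not from the thread), chosen so that string
# order and numeric order disagree. The header line is left out on purpose.
rows='1217,10,1/5/2016,"00",3,""
998,2,11/19/2015,"10",7,""
1217,2,11/19/2015,"08",4,""
1217,2,2/3/2015,"09",5,""'

tab=$(printf '\t')

# sort_rows is a hypothetical helper: reads header-less rows on stdin.
sort_rows() {
    # 1. Prefix each row with tab-separated keys: ord, line, YYYYMMDD, hour.
    # 2. Sort numerically on the first three keys, as a string on the hour.
    # 3. Drop the four key columns again.
    awk -F, '{
        split($3, d, "/")    # assumed layout: month/day/year
        printf "%d\t%d\t%04d%02d%02d\t%s\t%s\n", $1, $2, d[3], d[1], d[2], $4, $0
    }' |
    LC_ALL=C sort -t "$tab" -k1,1n -k2,2n -k3,3n -k4,4 |
    cut -f5-
}

printf '%s\n' "$rows" | sort_rows
```
If the real dates turn out to be day/month/year, only the order of d[1] and d[2] in the printf needs to change. The header line has to be printed separately before the rows are passed through sort_rows.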