Aggregate data within the file


 
# 1  
Old 11-23-2015
Aggregate data within the file

Guys,
I need to roll up data within the file and build a new file with the output in the same format as the original file.

The data should be rolled up for each unique combination of ord, line, Date, and Hour. The last column, appr, is always "".
Below is the format.

Original File:

Code:
ord,line,Date,Hour,count,appr
1217,1,11/19/2015,"00",8,""
1217,1,11/19/2015,"00",24,""
1217,1,11/19/2015,"08",4,""
1217,1,11/19/2015,"09",20,""
1217,1,11/19/2015,"10",42,""
1217,1,11/19/2015,"10",62,""
 

New File:

Code:
ord,line,Date,Hour,count,appr
1217,1,11/19/2015,"00",32,""
1217,1,11/19/2015,"08",4,""
1217,1,11/19/2015,"09",20,""
1217,1,11/19/2015,"10",104,""

Any thoughts are appreciated.

Last edited by jim mcnamara; 11-23-2015 at 06:04 PM..
# 2  
Old 11-23-2015
Perhaps something like:
Code:
awk '
function printlast() {
	if(last)
		print last, count, appr
	count = 0
}
BEGIN {	FS = OFS = ","
}
FNR == 1 {
	print
	next
}
last != $1 OFS $2 OFS $3 OFS $4 {
	printlast()
	last = $1 OFS $2 OFS $3 OFS $4
	appr = $6
}
{	count += $5
}
END {	printlast()
}' file

would do what you want.

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
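
For reference, here is the script above run end-to-end against the sample data from post #1 (the file names `file` and `newfile` are just placeholders):

```shell
# Sample input from post #1, already grouped by ord/line/Date/Hour.
cat > file <<'EOF'
ord,line,Date,Hour,count,appr
1217,1,11/19/2015,"00",8,""
1217,1,11/19/2015,"00",24,""
1217,1,11/19/2015,"08",4,""
1217,1,11/19/2015,"09",20,""
1217,1,11/19/2015,"10",42,""
1217,1,11/19/2015,"10",62,""
EOF

# Aggregate the count column for each run of lines sharing the same key.
awk '
function printlast() {
	if(last)
		print last, count, appr
	count = 0
}
BEGIN {	FS = OFS = ","
}
FNR == 1 { print; next }
last != $1 OFS $2 OFS $3 OFS $4 {
	printlast()
	last = $1 OFS $2 OFS $3 OFS $4
	appr = $6
}
{	count += $5 }
END {	printlast() }' file > newfile

cat newfile
```

This prints the header plus four aggregated rows, matching the "New File" in post #1.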
# 3  
Old 11-23-2015
Don,
Thank you very much for the quick response. It works like a charm. I don't have a whole lot of experience using awk. If you don't mind, can you explain what the code is doing at the line level?
Thanks again!
# 4  
Old 11-23-2015
Does this help?:
Code:
awk '
# Define function to print results for last aggregated data.
function printlast() {
	# If "last" is not empty (which will be true except for the first time
	# this function is called)...
	if(last)
		# print the results.
		print last, count, appr
	# Clear the accumulated count.
	count = 0
}
# Before reading any input files, set the input and output field separators to a
# comma.
BEGIN {	FS = OFS = ","
}
# When we are looking at the 1st line in an input file...
FNR == 1 {
	# copy the header line to the output...
	print
	# and skip to the next input line without executing the remaining lines
	# of this script for this line.
	next
}
# If "last" does not match the first four fields of the current input line...
last != $1 OFS $2 OFS $3 OFS $4 {
	# print the accumulated data for the previous line...
	printlast()
	# set "last" to the first four fields of the current input line...
	last = $1 OFS $2 OFS $3 OFS $4
	# and, set "appr" to the last field on this line.
	appr = $6
}
# Add the count from the current line to the total for lines matching "last".
{	count += $5
}
# After we have processed all lines from the input file, print the accumulated
# data for the last set of aggregated data.
END {	printlast()
}' file

# 5  
Old 11-24-2015
This really helps!
Thanks

---------- Post updated 11-24-15 at 12:19 PM ---------- Previous update was 11-23-15 at 08:05 PM ----------

It looks like the awk command has an issue if the repeated lines are not in order.
For example, running the awk command on the file below doesn't work:

Code:
ord,line,Date,Hour,count,appr
1217,1,11/19/2015,"00",8,""
1217,1,11/19/2015,"08",4,""
1217,1,11/19/2015,"10",62,""
1217,1,11/19/2015,"09",20,""
1217,1,11/19/2015,"00",24,""
1217,1,11/19/2015,"10",42,""

Any thoughts?

Last edited by Franklin52; 11-24-2015 at 03:26 PM.. Reason: Please use code tags
# 6  
Old 11-24-2015
How about prepending { read X; echo "$X"; sort; } < file | to the awk command (and dropping the trailing file name)? The read/echo pair passes the header line through unsorted, and sort brings lines with matching keys together before they are aggregated.
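
Put together with the script from post #2, the whole pipeline might look like this (file names are placeholders; the trailing `file` argument to awk is dropped because the data now arrives on standard input):

```shell
# Sample input from post #5, with duplicate keys out of order.
cat > file <<'EOF'
ord,line,Date,Hour,count,appr
1217,1,11/19/2015,"00",8,""
1217,1,11/19/2015,"08",4,""
1217,1,11/19/2015,"10",62,""
1217,1,11/19/2015,"09",20,""
1217,1,11/19/2015,"00",24,""
1217,1,11/19/2015,"10",42,""
EOF

# read/echo pass the header through untouched; sort groups matching keys.
{ read X; echo "$X"; sort; } < file | awk '
function printlast() {
	if(last)
		print last, count, appr
	count = 0
}
BEGIN {	FS = OFS = ","
}
FNR == 1 { print; next }
last != $1 OFS $2 OFS $3 OFS $4 {
	printlast()
	last = $1 OFS $2 OFS $3 OFS $4
	appr = $6
}
{	count += $5 }
END {	printlast() }' > newfile

cat newfile
```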
# 7  
Old 11-24-2015
If the output order doesn't matter, you could use this simpler awk script:
Code:
awk '
# Before reading any input files, set the input and output field separators to a
# comma.
BEGIN {	FS = OFS = ","
}
# When we are looking at the 1st line in an input file...
FNR == 1 {
	# Copy the header line to the output.
	print
	# and skip to the next input line without executing the remaining lines
	# of this script for this line.
	next
}
# Accumulate data from the remaining lines in the file(s)...
{	# Set "key" to the first four input fields, accumulate the count (field
	# 5) from all lines with that key, and save the last appr field for each
	# key to be printed at the end.
	count[key = $1 OFS $2 OFS $3 OFS $4] += $5
	appr[key] = $6
}
# After we have processed all lines from the input file, print the accumulated
# data.
END {	for(key in count)
		print key, count[key], appr[key]
}' file
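
One caveat, if output order matters after all: the order of a for (key in array) loop in awk is unspecified, so the aggregated rows may come out in any order. A sketch of re-sorting the result afterwards, reusing the read/echo trick from post #6 to keep the header first (file names are placeholders):

```shell
# Sample unordered input from post #5.
cat > file <<'EOF'
ord,line,Date,Hour,count,appr
1217,1,11/19/2015,"00",8,""
1217,1,11/19/2015,"08",4,""
1217,1,11/19/2015,"10",62,""
1217,1,11/19/2015,"09",20,""
1217,1,11/19/2015,"00",24,""
1217,1,11/19/2015,"10",42,""
EOF

# Aggregate with the array-based script; row order is unspecified.
awk '
BEGIN {	FS = OFS = ","
}
FNR == 1 { print; next }
{	count[key = $1 OFS $2 OFS $3 OFS $4] += $5
	appr[key] = $6
}
END {	for(key in count)
		print key, count[key], appr[key]
}' file > newfile

# Keep the header first and sort only the data rows.
{ read X; echo "$X"; sort; } < newfile
```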
