sort and summarize


 
# 1  
Old 12-06-2007
sort and summarize

Hi Guys,

I have a file in UNIX with duplicates. I have used the sort command below to delete duplicates based on the KEY positions/columns, but now I do not want to "delete" the duplicates; I want to summarize (total) the numeric columns by KEY.

REALLY NEED HELP... URGENT!!!

Thanks in advance.

sort -k 1.1,1.92 -u file > outfile
# 2  
Old 12-06-2007

I don't think sort does that natively...

If you can provide an example input and an example output showing what you want done, it's probably scriptable.
# 3  
Old 12-06-2007
Here is the example:

1288M99G14 ALA201001+00000000.000+00000005.000
1288M99G14 ALA201001+00000000.000+00000005.000
1288M99G14 ALB201001+00000005.000+00000000.000
1288M99G14 ALA201002+00000000.000+00000017.000
1288M99G14 ALB201001+00000017.000+00000000.000
1288M99G14 ALA201002+00000000.000+00000005.000

Output:

1288M99G14 ALA201001+00000000.000+00000010.000
1288M99G14 ALB201001+00000022.000+00000000.000
1288M99G14 ALA201002+00000000.000+00000022.000

So summarize by the first 2 fields, totalling the two numeric columns.
# 4  
Old 12-07-2007

Ah, so it's totalling them...

Sounds like an awk or perl solution would be the way to go.
You can then pipe the output through sort to get whatever order you want. You've already got the sort right (without the -u, of course), so I'll focus on the totalling part...

As I'm not great with awk, I'll try perl; I'm sure one of the awk wizards around here can offer up a solution for that :)

Code:
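#!/usr/bin/perl -w

# Total the two numeric columns for each key (the text before the first "+").
while (<>) {
  chomp;
  ($name,$left,$right)=split(/\+/);
  $vals{$name}{"left"}+=$left;
  $vals{$name}{"right"}+=$right;
}

# Print one line per key in the input's fixed-width format.
# Hash order is arbitrary, so pipe the result through sort if order matters.
foreach $name (keys %vals) {
  printf "%s+%012.3f+%012.3f\n",$name,$vals{$name}{'left'},$vals{$name}{'right'};
}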

# 5  
Old 12-07-2007
Try this one

filename=$1
sort "$filename" |
awk 'BEGIN { FS="+"; prev_key1=""; prev_key2=0; prev_key3=0; first=1 }
{
    if ($1 == prev_key1) {
        # Same key as the previous line: keep accumulating.
        prev_key2 += $2
        prev_key3 += $3
    } else {
        # Key changed: print the finished group, then start a new one.
        if (!first)
            printf("%20.20s+%012.3f+%012.3f\n", prev_key1, prev_key2, prev_key3)
        else
            first = 0
        prev_key1 = $1
        prev_key2 = $2
        prev_key3 = $3
    }
}
# Print the last group (the key is assumed to be 20 characters wide, as in the sample).
END { if (!first) printf("%20.20s+%012.3f+%012.3f\n", prev_key1, prev_key2, prev_key3) }'
# 6  
Old 12-09-2007
Quote:
Originally Posted by ranjithpr
filename=$1
sort $filename|
awk ' BEGIN {FS="+"; prev_key1=""; prev_key2=0; prev_key3=0; first=1; }
...
The sort needs to be smarter: the OP was not sorting by the first element (but they already have that bit working, so I just left it out of my solution :) )
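
One possible reading of this: for the sorted-input approach, rows only need to be grouped by key, so the sort can be restricted to the key itself. A minimal sketch, assuming the key is everything before the first "+" (as in the sample from post #3). The name total_sorted is hypothetical, and LC_ALL=C is an added assumption to get plain byte-order sorting:

```shell
#!/bin/sh
# Hypothetical helper: sort on the key only, then total each run of equal keys.
total_sorted() {
    LC_ALL=C sort -t '+' -k1,1 "$@" |
    awk -F'+' '
        # Key changed: print the finished group and reset the totals.
        NR > 1 && $1 != key {
            printf "%s+%012.3f+%012.3f\n", key, left, right
            left = right = 0
        }
        { key = $1; left += $2; right += $3 }
        # Print the last group, unless the input was empty.
        END { if (NR) printf "%s+%012.3f+%012.3f\n", key, left, right }
    '
}

# Sample rows from post #3, fed on standard input.
total_sorted <<'EOF'
1288M99G14 ALA201001+00000000.000+00000005.000
1288M99G14 ALA201001+00000000.000+00000005.000
1288M99G14 ALB201001+00000005.000+00000000.000
1288M99G14 ALA201002+00000000.000+00000017.000
1288M99G14 ALB201001+00000017.000+00000000.000
1288M99G14 ALA201002+00000000.000+00000005.000
EOF
```

The totals come out in key order (ALA201001, ALA201002, ALB201001); pipe the result through another sort if a different order is needed.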
# 7  
Old 12-09-2007
Another Awk/sort try:

Code:
awk '{
	x[$1] += $2
	y[$1] += $3
} END {
	for (e in x)
		printf "%s+%012.3f+%012.3f\n",
		e, x[e], y[e]
}' FS="+" filename|sort -t " " -k2.4n

Use nawk or /usr/xpg4/bin/awk on Solaris
(or just write the printf statement on one line :) ).
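
As a sanity check, the pipeline above can be run against the sample rows from post #3. This is only a sketch: summarize is a hypothetical wrapper name, the array names are changed for readability, the printf is written on one line, and LC_ALL=C is an added assumption so that ties between equal numeric sort keys are broken in byte order. With the rows as posted, the three groups total 10.000, 22.000 and 22.000.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk/sort pipeline from this post.
summarize() {
    awk -F'+' '
        # Accumulate both numeric columns per key (the text before the first "+").
        { left[$1] += $2; right[$1] += $3 }
        END {
            for (key in left)
                printf "%s+%012.3f+%012.3f\n", key, left[key], right[key]
        }
    ' "$@" | LC_ALL=C sort -t ' ' -k2.4n
}

# Sample rows from post #3, fed on standard input.
summarize <<'EOF'
1288M99G14 ALA201001+00000000.000+00000005.000
1288M99G14 ALA201001+00000000.000+00000005.000
1288M99G14 ALB201001+00000005.000+00000000.000
1288M99G14 ALA201002+00000000.000+00000017.000
1288M99G14 ALB201001+00000017.000+00000000.000
1288M99G14 ALA201002+00000000.000+00000005.000
EOF
# Prints:
# 1288M99G14 ALA201001+00000000.000+00000010.000
# 1288M99G14 ALB201001+00000022.000+00000000.000
# 1288M99G14 ALA201002+00000000.000+00000022.000
```

The sort key -k2.4n starts at the fourth character of the second space-separated field, so rows are ordered numerically by the 201001/201002 part.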
