Sponsored Content
Top Forums Shell Programming and Scripting Group by a column and first 2 characters of another Post 302956982 by Don Cragun on Monday 5th of October 2015 07:05:53 PM
Old 10-05-2015
You could try something like:
Code:
awk '
BEGIN {	print "\tPR\tGR\tTY\tOTHERS"
	grp["PR"]; grp["GR"]; grp["TY"]
}
{	if(!((g = substr($2, 1, 2)) in grp))
		g = "OTHERS"
	if(!($1 in city))
		citylist[++cc] = city[$1] = $1
	data[$1, g] = data[$1, g] == "" ? $2 : data[$1, g] "|" $2
}
END {	for(i = 1; i <= cc; i++)
		printf("%s\t%s\t%s\t%s\t%s\n", citylist[i],
		    data[citylist[i], "PR"], data[citylist[i], "GR"],
		    data[citylist[i], "TY"], data[citylist[i], "OTHERS"])
}' file

which will work even if the input lines for a given city are not all contiguous. If the data for all cities are grouped on contiguous lines (as in your sample data), this script could be simplified.

This code produces the output you said you wanted except it does not put a space in the heading line after the GR before the tab field separator.

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
These 2 Users Gave Thanks to Don Cragun For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Merge group numbers and add a column containing group names

I have a file in the following format. Groups of data merge together and the group number is indicated above each group. 1 adrf dfgr dfg 2 dfgr dfgr 3 dfef dfr fd 4 fgrt fgr fgg 5 fgrt fgr (3 Replies)
Discussion started by: Lucky Ali
3 Replies

2. Shell Programming and Scripting

Merge group numbers and add a column containing group names

Hi All I do have a file like this with 6 columns. Groups of data merge together and the group number is indicated above each group. 1 1 12 26 289 3.2e-027 GCGTATGGCGGC 2 12 26 215 6.7e+006 TTCCACCTTTTG 3 9 26 175 ... (1 Reply)
Discussion started by: Lucky Ali
1 Replies

3. Shell Programming and Scripting

Sum of column by group wise

Hello All , I have a problem with summing of column by group Input File - COL_1,COL_2,COL_3,COL_4,COL_5,COL_6,COL_7,COL_8,COL_9,COL_10,COL_11 3010,21,1923D ,6,0,0.26,0,0.26,-0.26,1,200807 3010,21,192BI ,6,24558.97,1943.94,0,1943.94,22615.03,1,200807 3010,21,192BI... (8 Replies)
Discussion started by: jambesh
8 Replies

4. Shell Programming and Scripting

Group by column from CSV file

Hello Friends, I need a shell script to extract a column in a csv file, group by the column, take the count and print the group with count more than one. Thanks (1 Reply)
Discussion started by: raka_rjit
1 Replies

5. Shell Programming and Scripting

sed or tr to remove specific group of special characters

Hi, I have a input of the form: ..., word1, word2, word3... I want out put of the form word1, word2, word3 I tried echo '..., word1, word2, word3...' | tr -d '...,' but that takes out the commas in the middle too so I get word1 word2 word3 but I want the commas in the middle. ... (3 Replies)
Discussion started by: forumbaba
3 Replies

6. UNIX for Dummies Questions & Answers

Split binary file every occurrence of a group of characters

Hello I am new to scripts, codes, bash, terminal, etc. I apologize this my be very scattered because I frankly don't have any idea where to begin and I have had trouble sleeping lately. I have several 2GB files I wish to split. This Code 00 00 01 BA ** ** ** ** ** ** ** ** C3 F8 00 00 01 BB 00... (17 Replies)
Discussion started by: PatrickE
17 Replies

7. Shell Programming and Scripting

Group by column and concatenate

Hi, Can you please help in resolving below issue: I have input file as below. COL1 COL2 ABC 111asdf ABC 222dfgh ABC 333fghy ABC 4rtyu XYZ 1yuio XYZ 2ytre Lookg for the Output File as below COL1 COL2 ABC '111asdf','222dfgh','333fghy','4rtyu' XYZ ... (6 Replies)
Discussion started by: Phani_Raghava
6 Replies

8. Shell Programming and Scripting

Remove the first character from the fourth column only if the column has four characters

I have a file as follows ATOM 5181 N AMET K 406 12.440 6.552 25.691 0.50 7.37 N ATOM 5182 CA AMET K 406 13.685 5.798 25.578 0.50 5.87 C ATOM 5183 C AMET K 406 14.045 5.179 26.909 0.50 5.07 C ATOM 5184 O MET K... (14 Replies)
Discussion started by: hasanabdulla
14 Replies

9. Shell Programming and Scripting

Group by and translate rows to column

I've a comma separated file with data below: 61401370587,505,1;0402686146,123;2387936.0;20170812 61401370587,505,2;0401296221,34;3.0;20170811 61401370587,505,5;0431169322,123;2387936.0;20170812 My requirement is to group by using 1st,2nd column . And translate the 3rd column's row data... (1 Reply)
Discussion started by: bhagat-reena
1 Replies

10. Shell Programming and Scripting

Group/concatenate certain column and basis on this do addition on other column

Hi Experts, Need your support I want to group/concatenate column 1,2,12 and 13 and if found duplicate then need to sum value of column 17,20,21 and column22. After concatenation if found unique then no action to be taken. Secondly want to make duplicate rows basis on grouping/concatenation of... (1 Reply)
Discussion started by: as7951
1 Replies
COL(1)							    BSD General Commands Manual 						    COL(1)

NAME
col -- filter reverse line feeds from input SYNOPSIS
col [-bfhpx] [-l num] DESCRIPTION
The col utility filters out reverse (and half reverse) line feeds so that the output is in the correct order with only forward and half for- ward line feeds, and replaces white-space characters with tabs where possible. This can be useful in processing the output of nroff(1) and tbl(1). The col utility reads from the standard input and writes to the standard output. The options are as follows: -b Do not output any backspaces, printing only the last character written to each column position. -f Forward half line feeds are permitted (``fine'' mode). Normally characters printed on a half line boundary are printed on the fol- lowing line. -h Don't output multiple spaces instead of tabs (default). -l num Buffer at least num lines in memory. By default, 128 lines are buffered. -p Force unknown control sequences to be passed through unchanged. Normally, col will filter out any control sequences from the input other than those recognized and interpreted by itself, which are listed below. -x Output multiple spaces instead of tabs. The control sequences for carriage motion that col understands and their decimal values are listed in the following table: ESC-7 reverse line feed (escape then 7) ESC-8 half reverse line feed (escape then 8) ESC-9 half forward line feed (escape then 9) backspace moves back one column (8); ignored in the first column carriage return (13) newline forward line feed (10); also does carriage return shift in shift to normal character set (15) shift out shift to alternate character set (14) space moves forward one column (32) tab moves forward to next tab stop (9) vertical tab reverse line feed (11) All unrecognized control characters and escape sequences are discarded. The col utility keeps track of the character set as characters are read and makes sure the character set is correct when they are output. If the input attempts to back up to the last flushed line, col will display a warning message. SEE ALSO
expand(1), nroff(1), tbl(1) STANDARDS
The col utility conforms to Version 2 of the Single UNIX Specification (``SUSv2''). HISTORY
A col command appeared in Version 6 AT&T UNIX. BSD
June 29, 1993 BSD
All times are GMT -4. The time now is 04:57 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy