grouping based on first column


 
# 1  
Old 02-08-2012

I have a tab-delimited file in the following format:
Code:
a_1 rt
a_1 st_2
a_1 st_3
a_2 bt_2
a_2 st_er
b_2 st_2
b_2 st_32
S_1 rt_8
S_1 rt_64

I want to condense this file by grouping on the first column, so that each distinct first-column value appears once, followed by all of its second-column values, like this:

Code:
a_1 rt st_2 st_3
a_2 bt_2 st_er
b_2 st_2 st_32
S_1 rt_8 rt_64

My file is a large text file, and I would like to know the best way to do this using awk or sed.
Please let me know.
# 2  
Old 02-08-2012
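Simple version, with the output sorted by the first column (the traversal order of awk's for-in loop is unspecified, hence the pipe through sort):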

Code:
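awk '
    # Append each second-column value to the entry for its first-column key.
    { stuff[$1] = ($1 in stuff) ? stuff[$1] " " $2 : $2 }
    END {
        # for-in order is unspecified, so the output is piped through sort.
        for( s in stuff )
            print s, stuff[s]
    }' input-file | sort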

If the output order must match the order in which the values in column 1 were first seen:
Code:
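awk '
    {
        # Record each key the first time it appears, to keep input order.
        if( !seen[$1]++ )
            order[++oidx] = $1
        # Append the value; add a separating space only after the first one.
        stuff[$1] = (seen[$1] > 1 ? stuff[$1] " " : "") $2
    }
    END {
        for( i = 1; i <= oidx; i++ )
            print order[i], stuff[order[i]]
    }
' input-file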
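
Since the file is described as big: both versions above keep every group in memory until the END block. If lines with the same first column are always adjacent, as they are in the sample, a streaming variant might be enough. This is only a sketch under that assumption; the function name group_sorted is made up for illustration, and it splits on tabs because the input is said to be tab-delimited.

```shell
# Hypothetical helper (the name group_sorted is an assumption for this sketch).
# Assumes lines sharing a first column are adjacent in the input, so only
# one group has to be held in memory at a time.
group_sorted() {
    awk -F'\t' '
        # Start a new group on the first line or when the key changes.
        # Appending "" forces a string comparison of the keys.
        NR == 1 || ($1 "") != key {
            if (NR > 1) print line   # flush the previous group
            key = $1 ""
            line = $1
        }
        # Append the second column to the current group.
        { line = line " " $2 }
        # Flush the last group, unless the input was empty.
        END { if (NR > 0) print line }
    ' "$@"
}

# Usage: group_sorted input-file
```

If the file is not already grouped, running it through sort first should restore the adjacency this relies on, at the cost of losing the original order.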
