Transposing data based on 1st column

08-11-2016

I have a large tab-delimited file in the following format:

Code:

aa 344 456
aa 34 67
bb 34 90
bb 23 100
bb 1 89
d 0 12
e 45 678
e 78 90
e 56 90
....
....
....

I would like to transpose the data based on the category in column one and get the output file in the following tab-delimited format:

Code:

a 344-456, 34-67
b 34-90,23-100,1-89
d 0-12
e 45-678,78-90,56-90

Please let me know the best way to do this using awk or sed.

Kanja

08-11-2016

Hello kanja,

If you don't care whether the output keeps the same order as Input_file, the following may help.

Code:

awk '{A[$1]=A[$1]?A[$1] OFS $2"-"$3:$2"-"$3} END{for(i in A){print i FS A[i]}}' OFS=", "   Input_file

Output will be as follows.

Code:

bb 34-90, 23-100, 1-89
d 0-12
e 45-678, 78-90, 56-90
aa 344-456, 34-67

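If you want the output in the same order as Input_file, the following may help. It reads Input_file twice: the first pass collects the values for each key, and the second pass prints each key the first time it is seen.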

Code:

awk 'FNR==NR{A[$1]=A[$1]?A[$1] OFS $2"-"$3:$2"-"$3;next} ($1 in A){print $1 FS A[$1];delete A[$1]}' OFS=", "  Input_file  Input_file

Output will be as follows.

Code:

aa 344-456, 34-67
bb 34-90, 23-100, 1-89
d 0-12
e 45-678, 78-90, 56-90
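
The same ordered output can probably also be produced in a single pass by remembering the order in which each key first appears. The following is only a sketch under that assumption; the function name group_in_order and the arrays order and A are made up for illustration.

```shell
# Hypothetical helper: group fields 2 and 3 by field 1 in one pass,
# keeping keys in the order in which they are first seen.
group_in_order() {
  awk -v OFS=", " '
    !($1 in A) { order[++n] = $1 }     # remember first appearance of this key
    { A[$1] = (A[$1] == "" ? "" : A[$1] OFS) $2 "-" $3 }
    END { for (k = 1; k <= n; k++) print order[k] FS A[order[k]] }
  ' "$@"
}

# Example run on part of the sample data from post #1
printf '%s\n' 'aa 344 456' 'aa 34 67' 'bb 34 90' 'bb 23 100' 'bb 1 89' 'd 0 12' |
  group_in_order
```

This trades a second read of the file for one extra array, which may matter when the input arrives on a pipe and cannot be read twice.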

Thanks,
R. Singh

RavinderSingh13

08-11-2016

Code:

awk '{i=substr($0,1,1) ;v[i]=v[i]?v[i]","$2"-"$3:$2"-"$3;} END{ for (i in v) print i"\t"v[i]}' input_file

Last edited by stomp; 08-11-2016 at 05:55 PM.. Reason: replaced space with TAB in output

stomp

08-11-2016

You said that your input file has <tab> delimited fields, but the sample input and output you provided are <space> delimited.
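
One quick way to check which delimiter a file really uses is to split on <tab> only and count the lines that end up with more than one field. This is a rough sketch; the function name count_tab_lines is made up for illustration.

```shell
# Hypothetical check: count the lines that contain at least one <tab>.
count_tab_lines() {
  awk -F'\t' '
    NF > 1 { tabbed++ }                # more than one field means a tab was found
    END { printf "%d of %d lines contain a tab\n", tabbed, NR }
  ' "$@"
}

# Example: the first line is <tab> delimited, the second is <space> delimited
printf 'aa\t344\t456\nbb 34 90\n' | count_tab_lines
```

If the count is zero, the file is not <tab> delimited, whatever the description says.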

If all records sharing the same first character of the first field are on adjacent lines in your input file (as in your example), you could try this simpler approach. It uses less memory (and, therefore, should run a little faster) and keeps the output order the same as the input order. It assumes that you want a <tab> separating fields in the output, but I assume you can see how to change that to a <space> if that is what you want:

Code:

awk '
last != substr($1, 1, 1) {
	if(NR > 1) print ""
	last = substr($1, 1, 1)
	printf("%s\t%s-%s", last, $2, $3)
	next
}
{	printf(",%s-%s", $2, $3)
}
END {	if(NR > 0) print ""
}' file

which, with your sample input, produces the output:

Code:

a	344-456,34-67
b	34-90,23-100,1-89
d	0-12
e	45-678,78-90,56-90

For any of the awk scripts suggested in this thread, if you want to try them on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.

Don Cragun

08-15-2016

Thank you all

I see that you all assumed that the first column contains only letters. If the first column also contained digits along with the letters, how would I modify the awk command? Also, the input file is tab-delimited.

For example:

Code:

aa123 344 456
aa123 34 67
bb34 34 90
bb34 23 100
bb34 1 89
d3 0 12
e55 45 678
e55 78 90
e55 56 90
....
....
....

Kanja

08-15-2016

@Kanja

No assumptions have been made, only answers based upon your input data and expected output.

pilnet101

08-15-2016

Your sample data in post #1 in this thread explicitly showed that no matter how many characters were in field 1 in your input file, you only wanted the 1st character of field 1 to appear in your output file. Have you now changed your mind about which lines are to be grouped together? If so, please explicitly state your new requirements.

Also note that you say your input file is <tab> delimited, but every sample file you have shown us is delimited by a single <space> character; not a <tab>.
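
Pending that clarification, if the goal is now to group on the whole first field of a genuinely <tab> delimited file, the adjacent-lines approach from earlier in this thread might be adapted roughly as follows. This is only a sketch under those assumptions: the function name group_adjacent_by_field1 is made up for illustration, and lines with the same first field must be adjacent in the input.

```shell
# Hypothetical helper: group on the entire first field of <tab> delimited input.
# Assumes all lines with the same first field are adjacent.
group_adjacent_by_field1() {
  awk -F'\t' '
    NR == 1 || $1 != last {            # a new key starts here
      if (NR > 1) print ""             # finish the previous output line
      last = $1
      printf("%s\t%s-%s", last, $2, $3)
      next
    }
    { printf(",%s-%s", $2, $3) }       # same key: append another pair
    END { if (NR > 0) print "" }       # finish the final output line
  ' "$@"
}

# Example run on <tab> delimited data shaped like the second sample
printf 'aa123\t344\t456\naa123\t34\t67\nbb34\t34\t90\nd3\t0\t12\n' |
  group_adjacent_by_field1
```

With -F'\t' the fields are split on <tab> only, so a first field such as aa123 is compared as a whole rather than by its first character.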

Last edited by Don Cragun; 08-15-2016 at 01:26 PM.. Reason: Fix typo: s/tabb/tab/

Don Cragun
