awk solution to duplicate lines based on column

10-10-2013


Hi experts, I have a tab-delimited file in which one column contains comma-separated values. I want to duplicate the entire line once for each value in that field, so that every output line carries a single value.

For example:

Code:

$ cat file
4444     4444            4444     4444
9990     2222,7777       6666     2222   <---this one
1900     1111            2222     4444
1800     0000            5555     8989
1700     3333,4444,5555  8787     4444   <---this one

Code:

$ cat output
4444     4444  4444     4444
9990     2222  6666     2222   <---duplicate1
9990     7777  6666     2222   <---duplicate2
1900     1111  2222     4444
1800     0000  5555     8989
1700     3333  8787     4444   <---duplicate1
1700     4444  8787     4444   <---duplicate2
1700     5555  8787     4444   <---duplicate3

Many thanks in advance for your help!

torchij


10-10-2013


Try:

Code:

awk '$2~","{n=split($2,a,",");for (i=1;i<=n;i++) {$2=a[i];print};next}1' OFS="\t" file
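If the one-liner is hard to read, the same idea can be spelled out. This is only a sketch: the helper name `expand_col`, the `col` variable and the temporary sample file are assumptions made for illustration, and it assumes the input really is tab-delimited.

```shell
# expand_col COL FILE: hypothetical helper, not part of the thread
expand_col() {
    awk -v col="$1" '
    BEGIN { FS = OFS = "\t" }           # read and write tab-delimited fields
    {
        n = split($col, vals, ",")      # number of comma-separated values
        if (n < 2) { print; next }      # nothing to expand: pass the line through
        for (i = 1; i <= n; i++) {
            $col = vals[i]              # assigning a field rebuilds $0 with OFS
            print
        }
    }' "$2"
}

sample=$(mktemp)                        # throwaway stand-in for "file"
printf '4444\t4444\t4444\t4444\n9990\t2222,7777\t6666\t2222\n1700\t3333,4444,5555\t8787\t4444\n' > "$sample"
expand_col 2 "$sample"
```

With that sample the 4444 row is printed once, the 9990 row twice and the 1700 row three times. Passing the column number as a variable is just a convenience in case the comma-separated values are not in the second column.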


bartus11


10-10-2013


Or, in this particular case (four columns, with commas only in the second), try:

Code:

awk '{$1=$1; gsub(/,/, OFS $3 OFS $4 ORS $1 OFS)}1' OFS='\t' file
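To see what the gsub is doing, it may help to run it on a single line with the replacement text pulled out into a variable. The name `repl` and the shell variable `out` are assumptions for illustration only, and `-v OFS` is used here in place of the trailing assignment so the sketch can read from a pipe.

```shell
# One input row, fed through a pipe instead of a file
out=$(printf '1700\t3333,4444,5555\t8787\t4444\n' |
awk -v OFS='\t' '
{
    $1 = $1                             # force $0 to be rebuilt with OFS
    # Text that replaces one comma: tail of this row, newline, head of the next row
    repl = OFS $3 OFS $4 ORS $1 OFS
    gsub(/,/, repl)                     # acts on $0, so every comma is replaced
    print
}')
printf '%s\n' "$out"
```

Each comma becomes "tab, column 3, tab, column 4, newline, column 1, tab", so the single row comes out as three complete rows. The trick relies on there being exactly four columns and no commas outside the second one, and since gsub treats `&` and backslashes in the replacement specially, fields containing those characters would not survive intact.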


Scrutinizer


10-11-2013


Many thanks, both solutions appear to work!

Cheers!

torchij
