forming duplicate rows based on value of a key


 
# 1 (03-07-2010)

Each row of a key (A, B, or others) carries a count in its 3rd column: if the first A row has 4 there, it has to form 4 duplicate rows, filled with all the 4th-column values of key A (2.9, 3.8, 4.2) and padded with "-" once those values run out.
Hope I explained the question clearly.

Cheers
Ruby

input
Code:
"A"        1           4           2.9
"A"        2           5           3.8
"A"        3           3           4.2
"B"        1           3           3.6

output
Code:
"A"        1           2.9
"A"        1           3.8
"A"        1           4.2
"A"        1           -
"A"        2           2.9
"A"        2           3.8
"A"        2           4.2
"A"        2           -
"A"        2           -
"A"        3           2.9
"A"        3           3.8
"A"        3           4.2
"B"        1           3.6
"B"        1           -
"B"        1           -


# 2 (03-07-2010)
Could you please explain why:

Code:
"B"        1           3.6
"B"        1           3.6
"B"        1           3.6

And not:

Code:
"B"        1           3.6
"B"        1           -
"B"        1           -

# 3 (03-07-2010)
Mistake

Sorry, my bad. You are right, it should be:

Code:
"B"        1           3.6
"B"        1           -
"B"        1           -

# 4 (03-07-2010)
Use gawk, nawk or, on Solaris, /usr/xpg4/bin/awk:

Code:
awk 'END {
  for (i = 1; i <= NR; i++) {
    split(n[i], t, SUBSEP); K = t[1]
    # re-split the value list only when the key changes
    if (K != pk) N = split(v[K], tt)
    for (j = 1; j <= k[n[i]]; j++)
      print t[1], t[2], (j <= N ? tt[j] : "-")
    pk = K
  }
}
{
  v[$1] = $1 in v ? v[$1] FS $NF : $NF  # per-key list of 4th-column values
  k[$1, $2] = $3; n[NR] = $1 SUBSEP $2  # count and order of each row
}' OFS='\t' infile
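
For readers who find the END-block bookkeeping dense, the same expansion can be sketched as a two-pass script that reads the file twice: pass one collects each key's 4th-column values, pass two prints every row's key and id $3 times, padding with "-" once the values run out. The file names infile and expanded.txt are placeholders:

```shell
# Sample input from post #1 (fields separated by whitespace)
cat > infile <<'EOF'
"A"        1           4           2.9
"A"        2           5           3.8
"A"        3           3           4.2
"B"        1           3           3.6
EOF

# Pass 1 (NR == FNR): collect the 4th-column values of each key.
# Pass 2: print each row key and id $3 times, padding with "-".
awk 'NR == FNR { v[$1] = ($1 in v) ? v[$1] OFS $4 : $4; next }
{
  n = split(v[$1], vals)
  for (j = 1; j <= $3; j++)
    print $1, $2, (j <= n ? vals[j] : "-")
}' OFS='\t' infile infile | tee expanded.txt
```

Reading the file twice trades a second pass for much simpler state: no END block, no saved row order, no key-change tracking.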


# 5 (03-07-2010)
Thanks

Thank you so much, it is working great.

# 6 (03-24-2010)
Hi, there is a small change in the input: it now has 5th and 6th columns, and some rows contain null values. Could you please modify the code for the following input? Thanks in advance.
Ruby

Code:
A	1	4	2.9	X/X	ggfgg
A	2	5	3.8	Y/Y	ghfghf
A	3	3	4.2	Z/Z	gg667
A	NULL	null	2.9	null	null
A	null	null	10.4	null	null
B	1	3	3.6	N/N	hjjghjg

output
Code:
A	1	2.9	X/X	ggfgg
A	1	3.8	X/X	ggfgg
A	1	4.2	X/X	ggfgg
A	1	2.9	X/X	ggfgg
A	-	10.4	-	ggfgg
A	2	2.9	Y/Y	ghfghf
A	2	3.8	Y/Y	ghfghf
A	2	4.2	Y/Y	ghfghf
A	2	2.9	Y/Y	ghfghf
A	2	10.4	Y/Y	ghfghf
A	3	2.9	Z/Z	gg667
A	3	3.8	Z/Z	gg667
A	3	4.2	Z/Z	gg667
A	-	2.9	-	gg667
A	-	10.4	-	gg667
B	1	3.6	N/N	hjjghjg
B	1	-	-	hjjghjg
B	1	-	-	hjjghjg


# 7 (03-25-2010)
Quote:
Originally Posted by ruby_sgp
hi small change in input1. Could you please modify the code based on the following input.
[...]
Try this:
Code:
awk -F'\t' 'END {
  for (i = 1; i <= c; i++) {
    split(r[i], t)
    # re-split the value list only when the key changes
    if (pt1 != t[1]) vnf = split(v[t[1]], tt)
    max = (vnf > t[3]) ? vnf : t[3]
    for (j = 1; j <= max; j++)
      print t[1], (t[3] >= j ? t[2] : "-"), \
        (tt[j] ? tt[j] : "-"), (t[3] >= j ? t[5] : "-"), t[6]
    pt1 = t[1]
  }
}
{
  v[$1] = v[$1] ? v[$1] FS $4 : $4  # per-key list of 4th-column values
  if ($3 + 0 > 0) r[++c] = $0       # keep only rows with a numeric count
}' OFS='\t' infile

However, the fourth field in the last three lines comes out a bit different here,
because I do not understand the logic at that point.
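
For completeness, the same logic can also be sketched as a two-pass script in a more explicit if-based style; the file names infile2 and expanded2.txt are illustrative. Like the script above, it prints N/N rather than "-" in the fourth field of the last two B lines, since the logic of the requested sample output is unclear at that point:

```shell
# Sample input from post #6 (tab-separated)
printf 'A\t1\t4\t2.9\tX/X\tggfgg\nA\t2\t5\t3.8\tY/Y\tghfghf\nA\t3\t3\t4.2\tZ/Z\tgg667\nA\tNULL\tnull\t2.9\tnull\tnull\nA\tnull\tnull\t10.4\tnull\tnull\nB\t1\t3\t3.6\tN/N\thjjghjg\n' > infile2

# Pass 1: collect every 4th-column value per key (null rows included).
# Pass 2: skip rows without a numeric count, then print max(count, values)
# lines, padding the id, value and 5th field with "-" where they run out.
awk -F'\t' 'NR == FNR { v[$1] = v[$1] ? v[$1] OFS $4 : $4; next }
$3 + 0 > 0 {
  n = split(v[$1], vals)
  max = (n > $3) ? n : $3
  for (j = 1; j <= max; j++)
    print $1, (j <= $3 ? $2 : "-"), (j <= n ? vals[j] : "-"),
      (j <= $3 ? $5 : "-"), $6
}' OFS='\t' infile2 infile2 | tee expanded2.txt
```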
 