awk repeat one field at all lines and modify field repetitions

09-18-2015

Hello experts

I have a file with paragraphs, each beginning with a date to keep and ending with "END":

Code:

20120301 num num
John        num num A keepnum1 num num
kathrin     num num A keepnum1 num num
kathrin     num num B keepnum2 num num
Pete        num num A keepnum1 num num
Jacob       num num A keepnum1 num num
Jacob       num num B keepnum2 num num
END
20111203 num num
Eva        num num A keepnum1 num num
Adam       num num A keepnum1 num num
Adam       num num B keepnum2 num num       
Jacob      num num A keepnum1 num num
Jacob      num num B keepnum2 num num
END
...

where "num" is a random number to be discarded and "keepnum" is a random number to keep.
What I want is:
1. to repeat the date on every line of each paragraph, before keepnum1;
2. to take the "B keepnum2" of each duplicate name and append it to the line where that name first occurs.
The output should look like this:

Code:

John        A 20120301 keepnum1 
kathrin     A 20120301 keepnum1  B keepnum2 
Pete        A 20120301 keepnum1 
Jacob        A 20120301 keepnum1  B keepnum2 
END
20111203 
Eva        A 20111203 keepnum1 
Adam      A 20111203  keepnum1  B keepnum2       
Jacob     A 20111203 keepnum1 B keepnum2 
END
....

I have only managed to extract the dates, with the following command:

Code:

 awk ' {RS="END"}; {print $1};'

But no luck beyond that. Any suggestions? (OS: openSUSE)

phaethon

09-19-2015

The following seems to do what you want, but since your output format is not consistent (no date before first output group, inconsistent spacing), I'm not sure if this is what you want:

Code:

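awk '
# First line of a paragraph: remember the date, print it, and reset state.
!head {	print date = $1
	head = 1
	last = ""
	next
}
# End of a paragraph: finish any pending output line, then print END.
/^END/ {head = 0
	if(last != "")
		print ""
	print "END"
	next
}
# First line for a new name: finish the previous line and start a new one.
$1 != last {
	if(last != "")
		print ""
	printf("%s\tA %s %s", last = $1, date, $5)
	next
}
{	# To get here, $1 on this line and $1 on the previous line must match...
	printf("\tB %s", $5)
}' file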

which, if file contains your sample input, produces the output:

Code:

20120301
John	A 20120301 keepnum1
kathrin	A 20120301 keepnum1	B keepnum2
Pete	A 20120301 keepnum1
Jacob	A 20120301 keepnum1	B keepnum2
END
20111203
Eva	A 20111203 keepnum1
Adam	A 20111203 keepnum1	B keepnum2
Jacob	A 20111203 keepnum1	B keepnum2
END

If someone wants to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
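
The script above relies on the lines for a repeated name being adjacent and on the labels always being A and then B. If either assumption might not hold, a possible variant (only a sketch, not part of the original answer) is to buffer each paragraph in arrays, take the label from $4, and print when END is reached. The function name regroup and the array names out and order are made up for this illustration; the input is the first paragraph of the sample from the question.

```shell
# Sketch: buffer each paragraph so a repeated name does not have to be
# on the line directly after its first occurrence.
# regroup, out, order and emit are hypothetical names used for illustration.
regroup() {
	awk '
	# Print the buffered lines of the current paragraph in first-seen order,
	# then clear the buffers for the next paragraph.
	function emit(   i) {
		for (i = 1; i <= n; i++) {
			print out[order[i]]
			delete out[order[i]]
			delete order[i]
		}
		n = 0
	}
	# First line of a paragraph holds the date.
	!head {	date = $1
		print date
		head = 1
		next
	}
	# End of paragraph: write out what was collected.
	/^END/ {	emit()
		print "END"
		head = 0
		next
	}
	# First line for this name: name, label ($4), date, kept number ($5).
	!($1 in out) {
		order[++n] = $1
		out[$1] = $1 "\t" $4 " " date " " $5
		next
	}
	# Name seen before in this paragraph: append its label and kept number.
	{	out[$1] = out[$1] "\t" $4 " " $5
	}
	# In case the last paragraph is not terminated by an END line.
	END {	emit()
	}' "$@"
}

regroup <<'EOF'
20120301 num num
John        num num A keepnum1 num num
kathrin     num num A keepnum1 num num
kathrin     num num B keepnum2 num num
Pete        num num A keepnum1 num num
Jacob       num num A keepnum1 num num
Jacob       num num B keepnum2 num num
END
EOF
```

With the sample input this should print the same lines as the first script. The trade-off is that nothing is printed for a paragraph until its END line is read, so a whole paragraph is held in memory at once.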

Don Cragun

09-19-2015

Thank you very much! It worked! I have no words!

phaethon
