awk - printing new lines based of 2 dates


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting awk - printing new lines based of 2 dates
# 8  
Old 03-22-2016
Hi Don,

Yes you're right, the key is COL2, COL3 and COL4 - Am I right in thinking that this is the bit that needs to be changed to reflect that?

Code:
$2$3$4 != last {
	# $2 has changed, add any needed entries from previous line up to and
	# including the maturity year.
	addlines(startyear + 1, endyear + 1, maturityyear - startyear)
	# Gather year and month & day from fields 7, 8 and, 18.
	split($0, fields)
	last = $2$3$4

There will be one file and the start date will be the same for each key - they will all use reporting day parameter as the start date.

For example for this record the reporting date would be the start date i.e. 2015-12-01 and the end date would be +1 year and -1 day to make sure we cover the whole year

Code:
COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8,COL9,COL10,COL11,COL12,COL13,COL14,COL15,COL16,COL17,COL18
C,1234,TEST,1,AA,AAAA,2019-12-01,2020-11-30,190425.71,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-11-30

Therefore the output would look like;

Code:
COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8,COL9,COL10,COL11,COL12,COL13,COL14,COL15,COL16,COL17,COL18
C,1234,TEST,1,AA,AAAA,2015-12-01,2016-11-30,0,AAA,P,4496015.93,NULL,NULL,NULL,NULL,NULL,2020-11-30 -- First record with start date = reporting date
C,1234,TEST,1,AA,AAAA,2016-12-01,2017-11-30,0,AAA,P,4496015.93,NULL,NULL,NULL,NULL,NULL,2020-11-30 
C,1234,TEST,1,AA,AAAA,2017-12-01,2018-11-30,0,AAA,P,4496015.93,NULL,NULL,NULL,NULL,NULL,2020-11-30 
C,1234,TEST,1,AA,AAAA,2018-12-01,2019-11-30,0,AAA,P,4496015.93,NULL,NULL,NULL,NULL,NULL,2020-11-30
C,1234,TEST,1,AA,AAAA,2019-12-01,2020-11-30,190425.71,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-11-30  -- Original input record

Thanks for your assistance
# 9  
Old 03-23-2016
If the data in COL2 is the same number of characters in every record and the data in COL3 is the same number of characters in every record, just concatenating those three fields in the awk variable last would be fine. Since I didn't know if those fields are constant width, I included the field separators in the comparison. And, with a few minor modifications to what I suggested before, it seems to work with your new requirements:
Code:
#!/bin/ksh
Report_Year='2015'
awk -v reportyear="$Report_Year" '
NR == 1 {
	print
	FS = OFS = ","
	next
}
function addlines(start1, end1, count) {
	if(last == "") return
	for(i = 0; i < count; i++) {
		for(j = 1; j < 7; j++)
			printf("%s%s", fields[j], OFS)
		printf("%4d%s%s%4d%s%s0%s", start1 + i, startmd, OFS, end1 + i,
		    endmd, OFS, OFS)
		for(j = 10; j <= 18; j++)
			printf("%s%s", fields[j], (j < 18) ? OFS : ORS)
	}
}
$2 FS $3 FS $4 != last {
	# $2 has changed, add any needed entries from previous line up to and
	# including the maturity year.
	addlines(startyear + 1, endyear + 1, maturityyear - startyear)
	# Gather year, month, & day from fields 7 & 8 and year from field 16.
	split($0, fields)
	last = $2 FS $3 FS $4
	startyear = substr($7, 1, 4)
	startmd = substr($7, 5)
	endyear = substr($8, 1, 4)
	endmd = substr($8, 5)
	maturityyear = substr($18, 1, 4)
	maturitymd = substr($18, 5)
	if(startmd > maturitymd)
		maturityyear--
	if(reportyear < startyear) {
		# Add records for missing years before the current record.
		save12 = $12
		fields[12] = sprintf("%.2f", $12 + $9)
		addlines(reportyear, reportyear + endyear - startyear,
		    startyear - reportyear)
		fields[12] = save12
	}
	# Print current record.
	print
	next
}
{	# $2 has not changed since the previous line.  
	# Get new start and end years from fields 7 & 8.
	nstartyear = substr($7, 1, 4)
	nendyear = substr($8, 1, 4)
	# Add any needed entries from previous line to this line.
	addlines(startyear + 1, endyear + 1, nstartyear - startyear - 1)
	# Reset startyear, endyear, and fields[] for next line.
	startyear = nstartyear
	endyear = nendyear
	split($0, fields)
	# Print current record.
	print
}
END {	addlines(startyear + 1, endyear + 1, maturityyear - startyear)
}' file

Additions and changes to the previous script are shown in red.
If file contains:
Code:
COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8,COL9,COL10,COL11,COL12,COL13,COL14,COL15,COL16,COL17,COL18
C,1234,TEST,1,AA,AAAA,2017-12-01,2018-11-30,190425.71,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-11-30
C,1234,TEST,2,AA,AAAA,2017-01-01,2017-12-31,190425.71,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-12-31

it produces the output:
Code:
COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8,COL9,COL10,COL11,COL12,COL13,COL14,COL15,COL16,COL17,COL18
C,1234,TEST,1,AA,AAAA,2015-12-01,2016-11-30,0,AAA,P,4496015.93,NULL,NULL,NULL,NULL,NULL,2020-11-30
C,1234,TEST,1,AA,AAAA,2016-12-01,2017-11-30,0,AAA,P,4496015.93,NULL,NULL,NULL,NULL,NULL,2020-11-30
C,1234,TEST,1,AA,AAAA,2017-12-01,2018-11-30,190425.71,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-11-30
C,1234,TEST,1,AA,AAAA,2018-12-01,2019-11-30,0,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-11-30
C,1234,TEST,1,AA,AAAA,2019-12-01,2020-11-30,0,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-11-30
C,1234,TEST,2,AA,AAAA,2015-01-01,2015-12-31,0,AAA,P,4496015.93,NULL,NULL,NULL,NULL,NULL,2020-12-31
C,1234,TEST,2,AA,AAAA,2016-01-01,2016-12-31,0,AAA,P,4496015.93,NULL,NULL,NULL,NULL,NULL,2020-12-31
C,1234,TEST,2,AA,AAAA,2017-01-01,2017-12-31,190425.71,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-12-31
C,1234,TEST,2,AA,AAAA,2018-01-01,2018-12-31,0,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-12-31
C,1234,TEST,2,AA,AAAA,2019-01-01,2019-12-31,0,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-12-31
C,1234,TEST,2,AA,AAAA,2020-01-01,2020-12-31,0,AAA,P,4305590.22,NULL,NULL,NULL,NULL,NULL,2020-12-31

Which I hope is what you expect. (Although I assume that you won't have any input files where there are multiple maturity months.)

Last edited by Don Cragun; 03-23-2016 at 04:43 PM.. Reason: Remove unneeded split().
This User Gave Thanks to Don Cragun For This Post:
# 10  
Old 03-24-2016
Don,

Thank you so much for the above - working perfectly for what I need.

Thanks for helping with this one, it's been a life saver!
# 11  
Old 03-24-2016
Hi Ads89,
I'm glad it is working for you.

There are two other changes that I should have made, but missed. If you're going to continue using this script, please change the lines containing:
Code:
	# $2 has changed, add any needed entries from previous line up to and
	# including the maturity year.
			and
{	# $2 has not changed since the previous line.

to:
Code:
	# The key has changed, add any needed entries from previous line up to
	# and including the maturity year.
			and
{	# The key has not changed since the previous line.

respectively.

Hopefully, with what you have seen here, you'll have a better chance of being able to do something like this on your own next time.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk to reformat lines based on condition

The awk below uses the tab-delimeted fileand reformats each line based on one of three conditions (rules). The 3 rules are for deletion (lines in blue), snv (line in red), and insertion (lines in green). I have included all possible combinations of lines from my actual data, which is very large.... (0 Replies)
Discussion started by: cmccabe
0 Replies

2. Shell Programming and Scripting

awk join lines based on keyword

Hello , I will need your help once again. I have the following file: cat file02.txt PATTERN XXX.YYY.ZZZ. 500 ROW01 aaa. 300 XS 14 ROW 45 29 AS XD.FD. PATTERN 500 ZZYN002 ROW gdf gsste ALT 267 fhhfe.ddgdg. PATTERN ERE.MAY. 280 PATTERRNTH 5000 rt.rt. ROW SO a 678 PATTERN... (2 Replies)
Discussion started by: alex2005
2 Replies

3. Shell Programming and Scripting

UNIX awk pattern matching and printing lines

I have the below plain text file where i have some result, in order to mail that result in html table format I have written the below script and its working well. cat result.txt Page 2015-01-01 2000 Colors 2015-02-01 3000 Landing 2015-03-02 4000 #!/bin/sh LOG=/tmp/maillog.txt... (1 Reply)
Discussion started by: close2jay
1 Replies

4. UNIX for Dummies Questions & Answers

awk solution to duplicate lines based on column

Hi experts, I have a tab-delimited file with one column containing values separated by a comma. I wish to duplicate the entire line for every value in that comma-delimited field. For example: $cat file 4444 4444 4444 4444 9990 2222,7777 6666 2222 ... (3 Replies)
Discussion started by: torchij
3 Replies

5. Shell Programming and Scripting

awk - printing nth field based on parameter

I have a need to print nth field based on the parameter passed. Suppose I have 3 fields in a file, passing 1 to the function should print 1st field and so on. I have attempted below function but this throws an error due to incorrect awk syntax. function calcmaxlen { FIELDMAXLEN=0 ... (5 Replies)
Discussion started by: krishmaths
5 Replies

6. Shell Programming and Scripting

Help With AWK Matching and Re-printing Lines

Hi All, I'm looking to use AWK to pattern match lines in XML file - Example patten for below sample would be /^<apple>/ The sample I wrote out is very basic compared to what I am actually working with but it will get me started I would like to keep the matched line(s) unchanged but have them... (4 Replies)
Discussion started by: rhoderidge
4 Replies

7. Shell Programming and Scripting

sed/awk : how to delete lines based on IP pattern ?

Hi, I would like to delete lines in /etc/hosts on few workstations, basically I want to delete all the lines for a list of machines like this : for HOST in $(cat stations.lst |uniq) do # echo -n "$HOST" if ping -c 1 $HOST > /dev/null 2>&1 then HOSTNAME_val=`rsh $HOST "sed... (3 Replies)
Discussion started by: albator1932
3 Replies

8. Shell Programming and Scripting

Gawk / Awk Merge Lines based on Key

Hi Guys, After windows died on my netbook I installed Lubuntu and discovered Gawk about a month ago. After using Excel for 10+ years I'm amazed how quick and easily Gawk can process data but I'm stuck with a little problem merging data from multiple lines. I'm an SEO Consultant and provide... (9 Replies)
Discussion started by: Jamesfirst
9 Replies

9. Shell Programming and Scripting

printing two lines in awk as two columns in excel

hi guys, i would like to print two lines from a file as two adjacent columns using excel using awk.. i have this so far: awk '{for(i=1; i<=NF; i++) {printf("%s\n",$i)}}' "$count".ttt > "$count".csv #this to print the first line from the .ttt file as rows of the first column in the .csv... (9 Replies)
Discussion started by: npatwardhan
9 Replies

10. Shell Programming and Scripting

Printing lines with specific awk NF

I have this files: ./frm/lf_mt1_cd.Ic_cell_template.attr ./die/addgen_tb_pumd.Ic_cell_template.attr ./min_m1_n.Ic_cell_template.attr When I use: awk -F\/ '{print NF}' Would result to: 3 3 2 I would like to list the files with 3 fields on it. Any Suggestions? (1 Reply)
Discussion started by: jehrome_rando
1 Replies
Login or Register to Ask a Question