Changing CSV files with date. Subtracting date by values


 
# 1  
Old 11-01-2018

Hi All,

I have a CSV file, shown below. Basically I need to take the year column in it and find if the year is >= 20152 . If it is, I should subtract 6 from all values. In the example below, the description contains numbers in YYWW form, so I need to subtract 5 from those. Wherever I find the year, I have to subtract 5. If the year is > 201601, I have to subtract 6. The year representation uses 52 weeks, so if the week is 03, for example 201403, subtracting 6 yields 201349. I am planning to do this in C++; I am not sure whether it is possible with awk or sed.

Code:
Year representation goes like 
201101
201103
..
..
..
201151
201152
201201
201202
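
The rollover implied by this list can be sketched as a small shell function around awk. This is only an illustration: the name `shift_yyyyww` and its two arguments are made up here, and it assumes that every year has exactly 52 weeks and that fewer than 52 weeks are subtracted.

```shell
# shift_yyyyww VALUE WEEKS: subtract WEEKS from a YYYYWW value,
# assuming every year has exactly 52 weeks (hypothetical helper).
shift_yyyyww() {
    awk -v v="$1" -v n="$2" 'BEGIN {
        year = substr(v, 1, 4) + 0
        week = substr(v, 5, 2) - n
        if (week <= 0) {        # rolled back past week 01
            week += 52
            year--
        }
        printf "%04d%02d\n", year, week
    }'
}

shift_yyyyww 201835 6    # 201829
shift_yyyyww 201201 1    # 201152, the entry just before 201201 in the list
```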

Original
Code:
id,description,type,year,obj
994475,1832 +TRANS     1835 10/17/18,S,201835,P
994477,1836 + NOTAPP 1839 10/17/18,S,201839,1
828058,CONTROL 1452-1527,1552-1627,S,201627,OP
828059,1452-1527,1552-1627,S,201627,UU

Modified
Code:
id,description,type,year
994475,1820 +TRANS     1829 10/17/18,S,201829,P         ---------------------  Year is 2018  should be subtracted by 6
994477,1830 + NOTAPP 1833 10/17/18,S,201833 ,1         ---------------------  Year is 2018  should be subtracted by 6
828058,CONTROL 1436-1521,1546-1621,S,201621,OP  ---------------------  Year is 2016  should be subtracted by 6
828059,1447-1522,1547-1602 ,S,201622,UU ---------------------                Year is 2015  should be subtracted by 5


Last edited by RudiC; 11-01-2018 at 04:24 PM..
# 2  
Old 11-02-2018
Sorry, for the most part I thought I understood your goal, but I am a bit confused:

Quote:
Originally Posted by arunkumar_mca
and find if the year is >= 20152 .
Is that simply a typo, or can the week numbers be either one or two digits? Should the value above in fact read "201502" or "20152x", or could the second week of 2015 be represented by both "201502" and "20152"?

Either way, it is easily done in awk, but with different algorithms, obviously.

bakunin
# 3  
Old 11-02-2018
Here is my try at the operation; perhaps a shorter / better solution can be made ...

Code:
BEGIN {
FS=OFS=","	# set OFS too, otherwise changing a field rejoins the line with spaces
mweek=52	# weeks per year
}
NR > 1 {	# skip the header line
yw=$(NF-1)	# year field, counted from the end of the line
year=substr(yw,1,4)
week=substr(yw,5,2)
variance=( year + 0 > 2015 ) ? 6 : 5

if ( int(week) == variance ) { week=mweek ; year=year - 1 }
else if ( int(week) < variance ) { week=mweek - (variance - week) ; year=year - 1 }
else { week=sprintf("%02d",week - variance) }
$(NF-1)=year week
} 1

Save as program.awk and run as awk -f program.awk input
Year is hardcoded and if variance needs to change, so does zero padding in week variable declaration.

Your input seems to have the year in field $4 or $5, which varies between lines, but always in $(NF-1), the field before the last one?
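
Because the description itself can contain commas, the number of fields differs between lines, which is why the year is addressed from the end of the record. A quick check on two of the sample lines from post #1 (the function name `show_year_field` is only for illustration):

```shell
# Print the field count and the next-to-last field of each CSV line on stdin.
show_year_field() {
    awk -F, '{ print NF, $(NF-1) }'
}

show_year_field <<'EOF'
994475,1832 +TRANS     1835 10/17/18,S,201835,P
828058,CONTROL 1452-1527,1552-1627,S,201627,OP
EOF
# 5 201835
# 6 201627
```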

Hope that helps
Regards
Peasant.
# 4  
Old 11-02-2018
First, my apologies for not posting the actual data.

Code:
ORIGINAL
994475;1832 +  S PP1835 10/17/18;S;P201835;115;N;4,4;M;0;xx994475;*;BA7005;10/17/2018 16:48
994477;1836 +  S PP1839 10/17/18;S;P201839;115;N;4,4;M;0;xxh994477;*;BA7005;10/17/2018 16:48
994479;CONTROL 1452-1527,1552-1627;P201527;115;N;4,4;M;0;RDHSYNDCT_12_1515FF_0706;*;B7005;10/17/2018 16:49


EXPECTED
994475;1826+  S PP1829 10/17/18;S;P201829;115;N;4,4;M;0;xx994475;*;BA7005;10/17/2018 16:48  ---> Subtract column  2,4,10 by -5 if it is 2015 lesser or by -6 if that is 2016 or greater
994477;1830 +  S PP1833 10/17/18;S;P201833;115;N;4,4;M;0;xxh994477;*;BA7005;10/17/2018 16:48  ---> Subtract column  2,4,10 by -5 if it is 2015 lesser or by -6 if that is 2016 or greater
994479;CONTROL 1447-1522,1547-1622;S;P201522;115;N;4,4;M;0;RHS_12_1510FF_0706;*;B7005;10/17/2018 16:49  ---> Subtract column  2,4,10 by -5 if it is 2015 lesser or by -6 if that is 2016 or greater


The year I should use will always be in column 4; in the above scenario it appears as P201835. The columns in which I need to do the subtraction are 2, 4 and 10.


I tried changing the awk code like this and ran it. It gives me the same output as the original, with no change.


Code:
BEGIN {
FS=";"
mweek=52
}
NR > 1{
year=substr($(NF-9),1,4)
week=substr($(NF-9),5,6)
variance=( year > 2015 ) ? 6 : 5

if ( int(week) == variance ) { week=mweek ; year=year - 1 ; sub($(NF-9),year week,$(NF-9)) }
else if ( int(week) < variance ) { week=mweek - (variance - week) ; year=year-1 ; sub($(NF-9),year week,$(NF-9)) }
else { week=sprintf("%02d",week - variance) ; sub($(NF-9),year week,$(NF-9)) }
} 1


Last edited by arunkumar_mca; 11-02-2018 at 12:14 PM..
# 5  
Old 11-02-2018
Fixed some typos..

I'm not sure I follow.
The input is now also inconsistent, with the first two rows having 13 fields and the last having 12.
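
One way to spot such rows is to print the field count per line. A minimal sketch, assuming `;` as the separator (the function name `count_fields` is hypothetical), run on the first and last ORIGINAL lines from post #4:

```shell
# Print "line-number field-count" for every ;-separated line on stdin.
count_fields() {
    awk -F';' '{ print NR, NF }'
}

count_fields <<'EOF'
994475;1832 +  S PP1835 10/17/18;S;P201835;115;N;4,4;M;0;xx994475;*;BA7005;10/17/2018 16:48
994479;CONTROL 1452-1527,1552-1627;P201527;115;N;4,4;M;0;RDHSYNDCT_12_1515FF_0706;*;B7005;10/17/2018 16:49
EOF
# 1 13
# 2 12
```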

Now you say you require fields 2, 4 and 10, but in the expected output you changed only field 4 for the first two lines and field 5 for the last line.

I have no idea what to do with 2 and 10, but we can work with 4 and 5 using awk match and a regex.
Will that be a good guess now, or are we missing some input again?

Code:
BEGIN {
FS=";"
mweek=52
}
# Only lines containing P<year week> are changed; a header line, if any, is printed as is.
match($0,/P[12][0-9][0-9][0-9][0-5][0-9]/) {
dw=substr($0,RSTART+1,RLENGTH-1)	# the six digits after the P
year=substr(dw,1,4)
week=substr(dw,5,2)
variance=( year + 0 > 2015 ) ? 6 : 5

if ( int(week) == variance ) { week=mweek ; year=year - 1 }
else if ( int(week) < variance ) { week=mweek - (variance - week) ; year=year - 1 }
else { week=sprintf("%02d",week - variance) }
sub(dw,year week,$0)	# replaces the first occurrence of the six digits in the line
} 1

Be sure other fields in line do not match P<year week regex>, since we are using $0

Please read about NF, RSTART and RLENGTH here (this is for gawk, but they are available in other awks as well):
ftp://ftp.gnu.org/pub/old-gnu/Manual...11.html#SEC110
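
As a small illustration of what match() leaves in RSTART and RLENGTH, here is a sketch using the same regex as above. The function name `extract_yyyyww` and the sample string are made up for the example.

```shell
# Print RSTART, RLENGTH and the six digits found after "P" in the argument.
extract_yyyyww() {
    awk -v s="$1" 'BEGIN {
        if (match(s, /P[12][0-9][0-9][0-9][0-5][0-9]/)) {
            # RSTART points at the P; skip it and keep the digits
            print RSTART, RLENGTH, substr(s, RSTART + 1, RLENGTH - 1)
        } else {
            print "no match"
        }
    }'
}

extract_yyyyww 'S;P201835;115'    # 3 7 201835
```

When nothing matches, match() returns 0 and RSTART is 0, so it is worth testing the return value before using substr().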


Regards
Peasant.

Last edited by Peasant; 11-02-2018 at 01:02 PM..
# 6  
Old 11-02-2018
I missed a column in the input.

Code:
Input with columns
994475;1832 +  S PP1835 10/17/18;S;P201835;115;N;4,4;M;0;xx994475;*;BA7005;10/17/2018 16:48
994477;1836 +  S PP1839 10/17/18;S;P201839;115;N;4,4;M;0;xxh994477;*;BA7005;10/17/2018 16:48
994479;CONTROL 1452-1527,1552-1627;S;P201527;115;N;4,4;M;0;RHS_12_1515FF_0706;*;B7005;10/17/2018 16:49

I changed all the columns in the expected output. In column 4, I changed P201835 to P201829. The same kind of change applies to columns 2 and 10: in column 2 I changed 1832 to 1826, and in column 10 of the last row I changed RHS_12_1515FF_0706 to RHS_12_1510FF_0706.

I ran it and it created the same output as the original:
Code:
(K>)  awk -f change.awk input.csv
994479;CONTROL 1452-1527,1552-1627;S;P201527;115;N;4,4;M;0;RHS_12_1515FF_0706;*;B7005;10/17/2018 16:49
(K>) cat input.csv
994479;CONTROL 1452-1527,1552-1627;S;P201527;115;N;4,4;M;0;RHS_12_1515FF_0706;*;B7005;10/17/2018 16:49


Last edited by arunkumar_mca; 11-02-2018 at 01:17 PM..
# 7  
Old 11-02-2018
All the subtraction done on fields $2 and $10 depends on the year in field $4, which defines the variance.
If nothing in $2 matches the string PP<number>, $2 is printed as is, unchanged.
If nothing in $10 matches the string RHS, $10 is printed as is, unchanged.

Hopefully that's it :

Code:
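BEGIN {
OFS=FS=";"
mweek=52
}

# Step a YYWW value back by n weeks, wrapping into the previous year.
function back(yyww, n,    y, w) {
	y=substr(yyww,1,2) ; w=substr(yyww,3,2) - n
	if ( w < 1 ) { w+=mweek ; y-- }
	return sprintf("%02d%02d",y,w)
}

# Only lines with P<year week> in $4 are changed, so the first data line is not skipped.
match($4,/P[12][0-9][0-9][0-9][0-5][0-9]/) {
dw=substr($4,RSTART+1,RLENGTH-1)	# the six digits after the P
year=substr(dw,1,4)
week=substr(dw,5,2)
variance=( year + 0 > 2015 ) ? 6 : 5

if ( match($10,/RHS/) ) {	# e.g. RHS_12_1515FF_0706
	split($10,g,"_")
	$10=g[1]"_"g[2]"_"back(g[3],variance)"FF_"g[4]
	}

if ( match($2,/PP[0-9][0-9][0-9][0-9]/) ) {	# e.g. PP1835
	a="PP"back(substr($2,RSTART+2,RLENGTH-2),variance)
	$2=substr($2,1,RSTART-1) a substr($2,RSTART+RLENGTH)
	}

if ( int(week) == variance ) {
	week=mweek ; year=year - 1
	}
else if ( int(week) < variance ) {
	week=mweek - (variance - week) ; year=year - 1
	}
else {
	week=sprintf("%02d",week - variance)
	}
sub(dw,year week,$4)
} 1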
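
The EXPECTED rows in post #4 also shift the bare YYWW numbers in column 2 (1832, 1452-1527 and so on), which a PP-only match leaves alone. A possible sketch that covers those as well is below. It is not a tested production script: the names `shift_weeks`, `back` and `back_all` are invented here, and it assumes 52-week years, that every run of exactly four digits in column 2 is a YYWW value, and that column 10 only needs changing when it starts with `RHS_` and carries the YYWW value at the start of its third `_`-separated part.

```shell
# shift_weeks [FILE...]: hypothetical wrapper; shifts the week values in
# columns 2, 4 and 10 of ;-separated input, assuming 52-week years.
shift_weeks() {
    awk '
    # Step a YYWW or YYYYWW value back by n weeks, wrapping into the previous year.
    function back(v, n,    len, y, w) {
        len = length(v)
        y = substr(v, 1, len - 2) + 0
        w = substr(v, len - 1, 2) - n
        if (w < 1) { w += 52; y-- }
        return sprintf("%0" (len - 2) "d%02d", y, w)
    }
    # Shift every run of exactly four digits in s; leave other text alone.
    function back_all(s, n,    out, tok) {
        out = ""
        while (match(s, /[0-9]+/)) {
            tok = substr(s, RSTART, RLENGTH)
            if (RLENGTH == 4) tok = back(tok, n)
            out = out substr(s, 1, RSTART - 1) tok
            s = substr(s, RSTART + RLENGTH)
        }
        return out s
    }
    BEGIN { FS = OFS = ";" }
    match($4, /[12][0-9][0-9][0-9][0-5][0-9]/) {
        pos = RSTART
        dw = substr($4, pos, 6)
        n = (substr(dw, 1, 4) + 0 > 2015) ? 6 : 5    # variance from the year in column 4
        $4 = substr($4, 1, pos - 1) back(dw, n) substr($4, pos + 6)
        $2 = back_all($2, n)
        if ($10 ~ /^RHS_/) {
            k = split($10, g, "_")
            g[3] = back(substr(g[3], 1, 4), n) substr(g[3], 5)
            s = g[1]
            for (i = 2; i <= k; i++) s = s "_" g[i]
            $10 = s
        }
    }
    { print }
    ' "$@"
}
```

Called as `shift_weeks input.csv`, this should turn the last input line of post #6 into `994479;CONTROL 1447-1522,1547-1622;S;P201522;...;RHS_12_1510FF_0706;...`. Lines without a year in column 4 are printed unchanged.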

Regards
Peasant.