Fill in missing hours and interpolate values using awk.

 
Thread Tools Search this Thread
Top Forums UNIX for Beginners Questions & Answers Fill in missing hours and interpolate values using awk.
# 1  
Old 10-26-2016
Fill in missing hours and interpolate values using awk.

I have a time series data like this
Code:
40754,35.6931,51.3092,201610160700,21.0
40754,35.6931,51.3092,201610160800,23.0
40754,35.6931,51.3092,201610160900,24.0
40754,35.6931,51.3092,201610161000,24.0
40754,35.6931,51.3092,201610161300,25.0
40754,35.6931,51.3092,201610161400,23.0
40754,35.6931,51.3092,201610161500,22.0

that 4th column is hourly dates. i have gap in 1000 to 13000 and i want to fill this gaps with 5th columns interpolation.
can i do this with awk? to have:
Code:
40754,35.6931,51.3092,201610160700,21.0
40754,35.6931,51.3092,201610160800,23.0
40754,35.6931,51.3092,201610160900,24.0
40754,35.6931,51.3092,201610161000,24.0
40754,35.6931,51.3092,201610161100,24.3
40754,35.6931,51.3092,201610161200,24.6
40754,35.6931,51.3092,201610161300,25.0
40754,35.6931,51.3092,201610161400,23.0
40754,35.6931,51.3092,201610161500,22.0

Moderator's Comments:
Mod Comment This post was moved to a new thread from a thread discussing an unrelated topic.

And, please use CODE tags when displaying sample input, sample output, and code segments (as required by forum rules).

Last edited by Don Cragun; 10-26-2016 at 04:05 AM.. Reason: Move to new thread and add CODE tags.
# 2  
Old 10-26-2016
Any attempts/ideas/thoughts from your side? Any preferred tool?
# 3  
Old 10-26-2016
I should do it in two steps:
1- filling the gaps maybe by reading the file in a date loop and add the gap lines
2- get the values for new lines by interpolation.

I could use this solution
Code:
awk '{x=$1-b;while(x-->1){print ++b," 40754,35.6931,51.3092,YYYYMMDDHHSS,0"};b=$1}1' file

but i don't know how could I make loop with date format(YYYYMMDDHHSS)?



Moderator's Comments:
Mod Comment Please use CODE tags as required by forum rules!

Last edited by RudiC; 10-26-2016 at 05:35 AM.. Reason: Added CODE tags.
# 4  
Old 10-26-2016
How about
Code:
awk -F, '
NR > 1 &&
$4 > L4+100     {CNT4 = ($4 - L4) / 100
                 STP5 = ($5 - L5) / CNT4
                 for (i=1; i<=CNT4; i++)        {$4 = L4 + 100 * i
                                                 $5 = sprintf ("%.1f", L5 + STP5 * i)
                                                 if (i < CNT4) print
                                                }

                }
                {L4 = $4
                 L5 = $5
                 print
                }
' OFS="," CONVFMT="%.0f" file
40754,35.6931,51.3092,201610160700,21.0
40754,35.6931,51.3092,201610160800,23.0
40754,35.6931,51.3092,201610160900,24.0
40754,35.6931,51.3092,201610161000,24.0
40754,35.6931,51.3092,201610161100,24.3
40754,35.6931,51.3092,201610161200,24.7
40754,35.6931,51.3092,201610161300,25.0
40754,35.6931,51.3092,201610161400,23.0
40754,35.6931,51.3092,201610161500,22.0

Note: it doesn't allow for crossing midnight.
This User Gave Thanks to RudiC For This Post:
# 5  
Old 10-26-2016
Thanks
It is excellent :-)
# 6  
Old 10-26-2016
it is work only for one day, if have two days data, it dos not work!
YYYYMMDDHHmm is date and the time and date range of each field is important.
Code:
40754,35.6931,51.3092,201612312100,24.3
40754,35.6931,51.3092,201612312200,24.7
40754,35.6931,51.3092,201612312300,25.0
40754,35.6931,51.3092,201701010100,23.0
40754,35.6931,51.3092,201701010200,22.0


Moderator's Comments:
Mod Comment
Please wrap code and data input/output in CODE tags.
It make it far easier to read and preserves multiple spaces for indenting or fixed width data.

Last edited by rbatte1; 10-26-2016 at 10:56 AM.. Reason: Added CODE tags
# 7  
Old 10-26-2016
Quote:
Originally Posted by emirzaei
it is work only for one day, if have two days data, it dos not work!
YYYYMMDDHHmm is date and the time and date range of each field is important.
Code:
40754,35.6931,51.3092,201612312100,24.3
40754,35.6931,51.3092,201612312200,24.7
40754,35.6931,51.3092,201612312300,25.0
40754,35.6931,51.3092,201701010100,23.0
40754,35.6931,51.3092,201701010200,22.0


Moderator's Comments:
Mod Comment
Please wrap code and data input/output in CODE tags.
It make it far easier to read and preserves multiple spaces for indenting or fixed width data.
Hi emirzaei,
RudiC did include this note with his suggestion:
Quote:
Note: it doesn't allow for crossing midnight.
Why don't you try to extend the awk script RudiC supplied to so it can handle missing lines that cross date boundaries

If you get it to work, post your script (in CODE tags) so others reading this thread can learn from what you have done.

If you can't get it to work, show us what you have tried (in CODE tags) and show us what it is doing (in CODE tags) and explain what still needs to be done. We are here to help you learn how to write your own code; not to act as your unpaid programming staff updating suggestions to meet changing requirements after suggestions are posted.

Can we assume that there will never be a string of missing entries that extends for more than 23 hours? (If we can make that assumption, we don't have to perform any arithmetic on dates; we just have to note that the date changed between two entries and the hours on adjacent entries aren't sequential.)

What operating system and shell are you using? (Tools available for performing arithmetic on dates varies considerably from system to system and shell to shell.) Telling us what operating system and shell you're using in the 1st post in a thread is always a good idea. Doing so will lead to fewer suggestions being made that will not work in your environment.
Login or Register to Ask a Question

Previous Thread | Next Thread

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Fill in missing values

Hi, I have a data sample as shown below. I want to fill in the left column so that the line will be continuous. For example, between 1 and 5 should be 2,3,4. And corresponding values in the right column will be 0. Thus the expected data should look like that: 1 1 1 10 1 2 1 3 1 5 1 6 2 0... (6 Replies)
Discussion started by: theanh0508
6 Replies

2. Shell Programming and Scripting

Fill in missing rows with zero to have uniform table

Hello, I have two files of same structure except some rows are missing randomly in each file. How do I fill the missing rows to have the exact ID column (S01 ~ S96) and rest columns filled with "0" with awk? The purpose of this step is to join the two files side by side. The closest thread is... (17 Replies)
Discussion started by: yifangt
17 Replies

3. Shell Programming and Scripting

Need help looking for missing hours.

I have a file that should cover a days worth of stats, at the beginning of each 15 minute report I have a unique header that looks like the below example. The "0000" and "0015" will change in the header line to show which 15 Minute interval the report is covering and of course from day to day the... (7 Replies)
Discussion started by: fsanchez
7 Replies

4. Shell Programming and Scripting

Fill in missing Data

hello everyone, I have a task to input missing data into a file. example of my data below: Wed Feb 01 09:00:02 EST 2012,,,0.4,0.3,,0.3,,0.3,,0.5,,0.3,,,0.4,0.3, Wed Feb 01 09:00:11 EST 2012,,,,,,,0.2,,,,,,,,,, Wed Feb 01 09:00:22 EST... (23 Replies)
Discussion started by: Nolph
23 Replies

5. Shell Programming and Scripting

Fill missing values with 2

Hi All, I have 100 .txt files which look like this: 3 4 5 6 7 Now, some files have some numbers missing in them and they look like this: 4 5 6 (6 Replies)
Discussion started by: shoaibjameel123
6 Replies

6. Shell Programming and Scripting

Fill missing numbers in second column with zeros

Hi All, I have 100 files with names like this: 1.dat, 2.dat, 3.dat until 100.dat. My dat files look like this: 42323 0 438939 1 434 0 0.9383 3434 120.23 3 234 As you can see in the second column, some numbers are missing. I want to fill those missing places with 0's in all... (3 Replies)
Discussion started by: shoaibjameel123
3 Replies

7. Shell Programming and Scripting

Fill the values between -500 to 500 -awk

input -200 2.4 0 2.6 30 2.8 output -500 0 -499 0 -488 0 .......... .......... .... -200 2.4 .... ... 0 2.6 (6 Replies)
Discussion started by: quincyjones
6 Replies

8. Shell Programming and Scripting

fill in missing columns

It can't be that hard, but I just can't figure it out: I have file like: File Sub-brick M_1 S_1 M_2 S_2 M_4 S_4 ... xxx 1 214 731 228 621 132 578 ... and would like to get 0 0 where M_3 S_3 is missing xxx 1 214 731 228 621 0 0 132 578 ... I wrote following script, but can't figure out... (3 Replies)
Discussion started by: avvk
3 Replies

9. Shell Programming and Scripting

Fill in missing numbers in range

I need to edit a list of numbers on the following form: 1 1.0 2 1.4 5 2.1 7 1.9 I want: 1 1.0 2 1.4 3 0.0 4 0.0 5 2.1 6 0.0 7 1.9 (i want to add the missing number in column 1 together with 0.0 in column 2). I guess it is rather trivial but i didn't even manage to read column... (5 Replies)
Discussion started by: bistru
5 Replies
Login or Register to Ask a Question