Find gaps in time data and replace missing time value and column 2 value by interpolation in awk
Dear all,
I am kindly seeking assistance on the following issue.
I am working with data sampled every 0.05 hours (that is, at 3-minute intervals). Here is a sample from the file:
As you can see, my data has a big gap here:
I am trying to find a way to locate such a gap, fill in the missing time values in column 1, and create a corresponding column 2 value for each missing time by interpolation. Something like:
I am processing my data with awk, so an awk solution would be easier to integrate into my code.
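For the "locate the gap" part alone, a minimal sketch might look like the following. It is only an illustration under stated assumptions: two whitespace-separated columns with time in column 1, strictly increasing times, and a nominal step of 0.05 hours. The function name find_gaps and the wording of the report line are made up for this example.

```shell
# find_gaps: report every place where more than one step separates
# two consecutive time stamps read from stdin.
# Assumptions: time is column 1, strictly increasing, nominal step 0.05 h.
# The name find_gaps and the message text are hypothetical.
find_gaps() {
  awk -v step="${1:-0.05}" '
    NR > 1 {
      # Steps between previous and current sample, rounded to an integer
      # so tiny floating-point noise cannot change the count.
      n = int(($1 - t0) / step + 0.5) - 1
      if (n > 0)
        printf "gap after %s: next sample %s, %d missing\n", t0, $1, n
    }
    { t0 = $1 }   # remember the previous time stamp
  '
}
```

It would be called as `find_gaps < datafile`, or `find_gaps 0.1 < datafile` for a different step.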
PLEASE NOTE: I am creating the data by resampling some other data that looks like this:
using the following code:
This code handles small gaps but not big ones like the one in this situation.
I am putting this here so that maybe someone can help me modify the code to handle this gap problem.
This seems very similar to several earlier requests in these forums. A good one to look at for ideas would be XY interpolation by time in awk. If that doesn't give you what you need, the last post in that thread provides pointers to several similar threads.
Please look at those first, and if you can't find anything in them that helps, explain how this case is different from the problem solved in the above thread and we'll try to help you get a working solution.
Thank you for your reply. I have read several similar threads, including the "XY interpolation by time in awk" thread you referred to. I have spent several days trying to adapt the solution there to my problem and am still trying, but being relatively new to awk programming, I have not managed to get anywhere. My problem differs from that one in that the data there is quite different from mine. In addition, my time values are floating-point numbers. I have seen some posts in "Expand & Interpolation" that I thought could help, but they needed integer values, which makes the issue simpler because it can be handled with an integer in the for loop. I have gone through several examples that I thought could be similar to my situation, but each of them seems different.
I will really appreciate any help given.
Last edited by malandisa; 01-15-2015 at 11:30 PM..
Please be aware that your data are not a linear function of time; on top of some noise they have a small curvature, and the six data points right after the gap are somewhat lower than they should be (there's a jump in values but not in time delta at point 7).
This is a quick and dirty approximation to exactly your problem and data; no error checking etc. is done. It's a linear interpolation between the given boundaries, although we know the boundary to the right is questionable. On top of that, my mawk has a problem with the D1 > D0 comparison: sometimes the delta is -1E-16, sometimes it's +71E-16, so a few extra lines are being "interpolated". I don't have a good solution at hand; either slightly increase the value being compared (yuck!) or use sort -u on the result (yuck!)...
However, try
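A minimal sketch of this kind of linear gap fill is given here. It is not the script originally posted in this thread, only an illustration under assumptions: two whitespace-separated columns (time in decimal hours, value), strictly increasing times, a 0.05-hour step, and four decimals on output. Instead of comparing floating-point times inside the loop, it counts the missing steps as a rounded integer, which is one way around the D1 > D0 rounding issue mentioned above. The function name fill_gaps is made up.

```shell
# fill_gaps: copy "time value" pairs from stdin to stdout and insert
# linearly interpolated rows wherever time stamps are missing.
# Assumptions: 2 columns, increasing time, step 0.05 h, %.4f output.
# The name fill_gaps is hypothetical.
fill_gaps() {
  awk -v step="${1:-0.05}" '
    NR > 1 {
      # Whole number of steps from the previous sample to this one;
      # rounding absorbs deltas such as 1e-16.
      n = int(($1 - t0) / step + 0.5)
      # Rows 1..n-1 are missing: interpolate between (t0,v0) and ($1,$2).
      for (i = 1; i < n; i++)
        printf "%.4f %.4f\n", t0 + i * step, v0 + ($2 - v0) * i / n
    }
    {
      printf "%.4f %.4f\n", $1, $2   # the sample actually present
      t0 = $1; v0 = $2
    }
  '
}
```

Called as `fill_gaps < datafile`. Because the loop counter is an integer, no `sort -u` pass or fudge factor should be needed, but the result is still only a straight line between the two boundary points.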
Thank you. This solution works and does exactly what I needed. The interpolation is just fine. As you can see, this is time series data for a quantity that depends on the sun and shows higher values during the day than at night. This is morning data, and the variation of the quantity is more complex due to many factors, so this interpolation does just fine. I have increased the precision of D0/D1 to include more significant figures, and that solves the floating-point comparison problem you indicated.
Again, thank you; I am sure this will help others as well.
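The rounding effect behind this can be reproduced in a few lines. The sketch below is only an illustration, not code from this thread: it adds two times that lie on the 0.05 grid and compares the sum with the expected value, first directly and then after rounding both sides to four decimals with sprintf. The function name cmp_times and the four-decimal precision are assumptions.

```shell
# cmp_times: show why raw float comparison of time stamps can fail
# and how comparing rounded strings avoids it.
# The name cmp_times and the %.4f precision are hypothetical choices.
cmp_times() {
  awk 'BEGIN {
    t = 0.1 + 0.2          # not exactly 0.3 in binary floating point
    raw = (t == 0.3) ? "equal" : "differ"
    # sprintf returns strings, so this is a string comparison
    rounded = (sprintf("%.4f", t) == sprintf("%.4f", 0.3)) ? "equal" : "differ"
    print "raw:", raw
    print "rounded:", rounded
  }'
}
```

With the usual double-precision awk builds, the raw comparison reports "differ" and the rounded one reports "equal".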