Fill in missing Data


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Fill in missing Data
# 22  
Old 03-20-2012
Quote:
Originally Posted by Corona688
What threw me is you showed what you wanted like "a,,c", filling in b inbetween. That had me thinking you wanted it filled in between the same rows, which of course makes no sense -- server b's data has nothing to do with a's or c's. But if you wanted extrapolation based on past data for that column, that makes more sense.

Here's some interpolation across rows. It has to load the whole dataset into memory though.

Code:
$ cat fill2.awk
 
BEGIN { FS="," ; OFS="," }
 
{       COLS=NF
        for(N=1; N<=NF; N++)    D[NR,N]=$N      }
 
END {
        for(N=2; N<=COLS; N++)
        {
#               printf("column %d\n", N);
 
                S=1
                for(L=1; L<=NR; L++)
                {
                        if(!D[L,N]) continue;
 
                        if(((L-S) < 2) || (!D[S,N]))
                        {
                                S=L;
                                continue;
                        }
 
#                       printf("[%s]", D[S,N]);
 
                        SLOPE=(D[L,N]-D[S,N])/(L-S);
 
                        for(M=S+1; M<L; M++)
                        {
                                X=M-S
                                D[M,N]=D[S,N]+(SLOPE*X);
#                               printf("\t'%s'", D[M,N]);
                        }
 
#                       printf("\t[%s]\n", D[L,N]);
 
                        S=L;
                }
        }
 
        for(L=1; L<=NR; L++)
        {
                printf("%s", D[L, 1]);
 
                for(N=2; N<=COLS; N++)  printf(OFS "%s", D[L,N]);
 
                printf("\n");
        }
}
 
$ awk -f fill2.awk data
 
Wed Feb 01 09:00:02 EST 2012,,,0.4,0.3,,0.3,,0.3,,0.5,,0.3,,,0.4,0.3,
Wed Feb 01 09:00:11 EST 2012,,,0.45,0.35,,0.35,0.2,0.35,,0.55,,0.35,,,0.45,0.35,
Wed Feb 01 09:00:22 EST 2012,0.7,0.4,0.5,0.4,0.3,0.4,,0.4,0.5,0.6,0.3,0.4,,0.1,0.5,0.4,
Wed Feb 01 09:00:42 EST 2012,0.9,1.0,0.6,0.5,0.366667,0.5,,0.433333,0.55,0.625,0.325,0.5,,,0.525,0.425,
Wed Feb 01 09:01:02 EST 2012,1.0,0.8,0.65,0.55,0.433333,0.6,,0.466667,0.6,0.65,0.35,0.6,,,0.55,0.45,
Wed Feb 01 09:01:22 EST 2012,0.95,0.7,0.7,0.6,0.5,,,0.5,0.65,0.675,0.375,0.55,,,0.575,0.475,
Wed Feb 01 09:01:42 EST 2012,0.9,0.6,0.6,0.5,,,,0.4,0.7,0.7,0.4,0.5,,,0.6,0.5,
Wed Feb 01 09:02:02 EST 2012,0.8,,0.5,0.4,,,,1.1,,0.65,0.3,0.4,,,,,
Wed Feb 01 09:02:22 EST 2012,0.9,,,,,,,0.7,,0.6,,,,,,,
 
$

It's still full of holes like swiss cheese wherever there was only one end to the data...

Tried the script out but was getting the following errors...

Code:
./awk.awk: line 1: BEGIN: command not found
: command not found}
: command not found
./awk.awk: line 4: syntax error near unexpected token `('
'/awk.awk: line 4: `        for(N=1; N<=NF; N++)    D[NR,N]=$N      }

# 23  
Old 03-20-2012
I've never heard of an awk that doesn't even have begin! I think the file got mangled between you copying it and running it somehow.

You are running it in awk, yes? Not as a shell script? Run it as I showed you -- awk -f filename.awk datafile

Also, try nawk; it could be version incompatibilities but I doubt it.
# 24  
Old 03-21-2012
i ran it just like you said. gave me a lot of syntax errors....

Code:
-bash-3.2$ awk -f awk.awk file.csv
awk: awk.awk:7:         for(N=2; N<=COLS; N++)
awk: awk.awk:7:         ^ syntax error
awk: awk.awk:7:         for(N=2; N<=COLS; N++)
awk: awk.awk:7:                ^ syntax error
awk: awk.awk:7:         for(N=2; N<=COLS; N++)
awk: awk.awk:7:                         ^ syntax error
awk: awk.awk:8:         for(N=2; N<=COLS; N++)
awk: awk.awk:8:                                ^ unexpected newline or end of string
awk: awk.awk:12:                 S=1
awk: awk.awk:12:                     ^ unexpected newline or end of string
awk: awk.awk:12:                 for(L=1; L<=NR; L++)
awk: awk.awk:12:                        ^ syntax error
awk: awk.awk:12:                 for(L=1; L<=NR; L++)
awk: awk.awk:12:                               ^ syntax error
awk: awk.awk:13:                 for(L=1; L<=NR; L++)
awk: awk.awk:13:                                      ^ unexpected newline or end of string
awk: awk.awk:14:                         if(!D[L,N]) continue;
awk: awk.awk:14:                                     ^ syntax error
awk: awk.awk:17:                         if(((L-S) < 2) || (!D[S,N]))
awk: awk.awk:17:                                                      ^ unexpected newline or end of string
awk: awk.awk:18:                                 S=L;
awk: awk.awk:18:                                    ^ syntax error
awk: awk.awk:24:                         SLOPE=(D[L,N]-D[S,N])/(L-S);
awk: awk.awk:24:                                                    ^ syntax error
awk: awk.awk:26:                         for(M=S+1; M<L; M++)
awk: awk.awk:26:                                  ^ syntax error
awk: awk.awk:26:                         for(M=S+1; M<L; M++)
awk: awk.awk:26:                                       ^ syntax error
awk: awk.awk:27:                         for(M=S+1; M<L; M++)
awk: awk.awk:27:                                              ^ unexpected newline or end of string
awk: awk.awk:29:                                 X=M-S
awk: awk.awk:29:                                       ^ unexpected newline or end of string
awk: awk.awk:29:                                 D[M,N]=D[S,N]+(SLOPE*X);
awk: awk.awk:29:                                                        ^ syntax error
awk: awk.awk:35:                         S=L;
awk: awk.awk:35:                            ^ syntax error
awk: awk.awk:39:         for(L=1; L<=NR; L++)
awk: awk.awk:39:                ^ syntax error
awk: awk.awk:39:         for(L=1; L<=NR; L++)
awk: awk.awk:39:                       ^ syntax error
awk: awk.awk:40:         for(L=1; L<=NR; L++)
awk: awk.awk:40:                              ^ unexpected newline or end of string
awk: awk.awk:41:                 printf("%s", D[L, 1]);
awk: awk.awk:41:                                      ^ syntax error
awk: awk.awk:43:                 for(N=2; N<=COLS; N++)  printf(OFS "%s", D[L,N]);
awk: awk.awk:43:                        ^ syntax error
awk: awk.awk:43:                 for(N=2; N<=COLS; N++)  printf(OFS "%s", D[L,N]);
awk: awk.awk:43:                                 ^ syntax error
awk: awk.awk:43:                 for(N=2; N<=COLS; N++)  printf(OFS "%s", D[L,N]);
awk: awk.awk:43:                                         ^ syntax error
awk: awk.awk:43:                 for(N=2; N<=COLS; N++)  printf(OFS "%s", D[L,N]);
awk: awk.awk:43:                                                                 ^ syntax error
awk: awk.awk:45:                 printf("\n");
awk: awk.awk:45:                             ^ syntax error

---------- Post updated at 08:47 AM ---------- Previous update was at 08:43 AM ----------

i recreated the file within shell and i think its now working

---------- Post updated at 09:45 AM ---------- Previous update was at 08:47 AM ----------

everything works like a charm. Thank you for the help!!
This User Gave Thanks to Mikes88 For This Post:
Login or Register to Ask a Question

Previous Thread | Next Thread

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Fill in missing values

Hi, I have a data sample as shown below. I want to fill in the left column so that the line will be continuous. For example, between 1 and 5 should be 2,3,4. And corresponding values in the right column will be 0. Thus the expected data should look like that: 1 1 1 10 1 2 1 3 1 5 1 6 2 0... (6 Replies)
Discussion started by: theanh0508
6 Replies

2. UNIX for Beginners Questions & Answers

Fill in missing hours and interpolate values using awk.

I have a time series data like this 40754,35.6931,51.3092,201610160700,21.0 40754,35.6931,51.3092,201610160800,23.0 40754,35.6931,51.3092,201610160900,24.0 40754,35.6931,51.3092,201610161000,24.0 40754,35.6931,51.3092,201610161300,25.0 40754,35.6931,51.3092,201610161400,23.0... (6 Replies)
Discussion started by: emirzaei
6 Replies

3. Shell Programming and Scripting

Fill in missing rows with zero to have uniform table

Hello, I have two files of same structure except some rows are missing randomly in each file. How do I fill the missing rows to have the exact ID column (S01 ~ S96) and rest columns filled with "0" with awk? The purpose of this step is to join the two files side by side. The closest thread is... (17 Replies)
Discussion started by: yifangt
17 Replies

4. Shell Programming and Scripting

Fill data if number range is given

Hi I want to get all numbers if number range is given as input. Eg: INPUT FILE 100-105 107 108-112 OUTPUT REQUIRED: 100 101 102 103 104 105 107 108 109 110 111 112 How can I do it using shell? :confused: Thanks in advance. (11 Replies)
Discussion started by: dashing201
11 Replies

5. Shell Programming and Scripting

fill in last column of data

Hello, I am fairly new to awk, and I have the following problem. My file has missing data in the last column, and the program I am pre-processing this file for cannot interpret correctly shortened rows (it just wraps the data around). Is there a way to force awk to create the same... (6 Replies)
Discussion started by: timert34
6 Replies

6. Shell Programming and Scripting

Fill missing values with 2

Hi All, I have 100 .txt files which look like this: 3 4 5 6 7 Now, some files have some numbers missing in them and they look like this: 4 5 6 (6 Replies)
Discussion started by: shoaibjameel123
6 Replies

7. Shell Programming and Scripting

Fill missing numbers in second column with zeros

Hi All, I have 100 files with names like this: 1.dat, 2.dat, 3.dat until 100.dat. My dat files look like this: 42323 0 438939 1 434 0 0.9383 3434 120.23 3 234 As you can see in the second column, some numbers are missing. I want to fill those missing places with 0's in all... (3 Replies)
Discussion started by: shoaibjameel123
3 Replies

8. Shell Programming and Scripting

fill in missing columns

It can't be that hard, but I just can't figure it out: I have file like: File Sub-brick M_1 S_1 M_2 S_2 M_4 S_4 ... xxx 1 214 731 228 621 132 578 ... and would like to get 0 0 where M_3 S_3 is missing xxx 1 214 731 228 621 0 0 132 578 ... I wrote following script, but can't figure out... (3 Replies)
Discussion started by: avvk
3 Replies

9. Shell Programming and Scripting

Fill in missing numbers in range

I need to edit a list of numbers on the following form: 1 1.0 2 1.4 5 2.1 7 1.9 I want: 1 1.0 2 1.4 3 0.0 4 0.0 5 2.1 6 0.0 7 1.9 (i want to add the missing number in column 1 together with 0.0 in column 2). I guess it is rather trivial but i didn't even manage to read column... (5 Replies)
Discussion started by: bistru
5 Replies
Login or Register to Ask a Question