awk matching script not working as expected

#8 Chubler_XL (Moderator), 4 Weeks Ago (original discussion by delbroooks)
How about this:



Code:
#!/usr/bin/awk -f
FNR==1 {file++}
{
  day=$1
  gsub(/-/, " ", day)
  split($2, t, ".")
  gsub(/:/, " ", t[1])
  x=mktime(day " " t[1]) + t[2] / 1000
  if(file==1) srctime[FNR]=x
  else desttime[FNR]=x
  records[file, FNR]=$0
}

END {
   offset=5*60
   max=2*60
   cur=1
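   # Numeric loop keeps file order; for (rec in srctime) visits keys in unspecified order
   for (rec = 1; rec in srctime; rec++) {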
       target = srctime[rec] + offset
       offsetmin = target - max
       offsetmax = target + max
       best = 9999999
       found = 0
       while(cur in desttime && desttime[cur] < offsetmax) {
           if (desttime[cur] < target && desttime[cur] > offsetmin &&
               best > target - desttime[cur]) {
                  best= target - desttime[cur]
                  found=cur
           }
           if (desttime[cur] >= target) {
              if(best > desttime[cur] - target) {
                  best=desttime[cur] - target
                  found=cur
               }
               break
           }
           cur++
        }

        if (found)
           print records[1, rec] " " records[2, found]
        else
           print records[1, rec] " NA NA"
    }
}


Result:


Code:
2018-02-16 16:45:29.557 farads 0.0004300000 2018-02-16 16:50:40.486 reactance 0.0002400000
2018-02-16 16:46:09.300 farads 0.0004300000 2018-02-16 16:51:22.525 reactance 0.0005900000
2018-02-16 16:47:10.987 farads 0.0002800000 2018-02-16 16:52:01.997 reactance 0.0003900000
2018-02-16 16:47:51.611 farads 0.0006500000 2018-02-16 16:52:43.612 reactance 0.0005200000
2018-02-16 16:47:51.612 farads 0.0006500000 2018-02-16 16:53:23.550 reactance 0.0003900000
2018-02-16 16:48:34.077 farads 0.0006600000 2018-02-16 16:53:23.550 reactance 0.0003900000
2018-02-16 16:49:17.015 farads 0.0003300000 2018-02-16 16:54:03.276 reactance 0.0005300000
2018-02-16 16:49:59.075 farads 0.0000700000 2018-02-16 16:54:44.223 reactance 0.0003800000
2018-02-16 16:50:40.486 farads 0.0002400000 2018-02-16 16:55:24.769 reactance 0.0003200000
2018-02-16 16:51:22.525 farads 0.0005900000 2018-02-16 16:56:10.028 reactance 0.0002700000
2018-02-16 16:52:01.997 farads 0.0003900000 2018-02-16 16:56:57.624 reactance 0.0000900000
2018-02-16 16:52:43.612 farads 0.0005200000 2018-02-16 16:57:37.387 reactance 0.0003000000
2018-02-16 16:53:23.550 farads 0.0003900000 2018-02-16 16:58:16.929 reactance 0.0005800000
2018-02-16 16:54:03.276 farads 0.0005300000 2018-02-16 16:58:56.961 reactance 0.0003000000



Edit: the previous solution could miss closer records that fall before the previous target. This should be more accurate:



Code:
#!/usr/bin/awk -f
FNR==1 {file++}
{
  day=$1
  gsub(/-/, " ", day)
  split($2, t, ".")
  gsub(/:/, " ", t[1])
  x=mktime(day " " t[1]) + t[2] / 1000
  if(file==1) srctime[FNR]=x
  else desttime[FNR]=x
  records[file, FNR]=$0
}

END {
   offset=5*60
   max=2*60
   deststart=0
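   # Numeric loop keeps file order; for (rec in srctime) visits keys in unspecified order
   for (rec = 1; rec in srctime; rec++) {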
       target = srctime[rec] + offset
       offsetmin = target - max
       offsetmax = target + max
       best = 9999999
       found = 0
       cur=deststart+1
       while(cur in desttime && desttime[cur] < offsetmax) {
           if (desttime[cur] < target && desttime[cur] > offsetmin &&
               best > target - desttime[cur]) {
                  if (best == 9999999) deststart = cur   # compare with ==; a single = would assign
                  best= target - desttime[cur]
                  found=cur
           }
           if (desttime[cur] >= target) {
              if(best > desttime[cur] - target) {
                  best=desttime[cur] - target
                  found=cur
               }
               break
           }
           cur++
        }

        if (found)
           print records[1, rec] " " records[2, found]
        else
           print records[1, rec] " NA NA"
    }
}


#9 delbroooks, 4 Weeks Ago
What command did you use at the command line to run the code?


Code:
gawk --lint -f awkscript4  < file1.txt file2.txt | less

prints the following:



Code:
2018-02-16 16:46:09.300 reactance 0.0004300000 2018-02-16 16:51:22.525 reactance 0.0005900000
2018-02-16 16:47:10.987 reactance 0.0002800000 2018-02-16 16:52:01.997 reactance 0.0003900000
2018-02-16 16:47:51.611 reactance 0.0006500000 2018-02-16 16:52:43.612 reactance 0.0005200000
2018-02-16 16:47:51.612 reactance 0.0006500000 2018-02-16 16:52:43.612 reactance 0.0005200000
2018-02-16 16:48:34.077 reactance 0.0006600000 2018-02-16 16:53:23.550 reactance 0.0003900000
2018-02-16 16:49:17.015 reactance 0.0003300000 2018-02-16 16:54:03.276 reactance 0.0005300000
2018-02-16 16:49:59.075 reactance 0.0000700000 2018-02-16 16:54:44.223 reactance 0.0003800000
2018-02-16 16:50:40.486 reactance 0.0002400000 2018-02-16 16:55:24.769 reactance 0.0003200000
2018-02-16 16:51:22.525 reactance 0.0005900000 2018-02-16 16:56:10.028 reactance 0.0002700000
2018-02-16 16:52:01.997 reactance 0.0003900000 2018-02-16 16:56:57.624 reactance 0.0000900000
2018-02-16 16:52:43.612 reactance 0.0005200000 2018-02-16 16:57:37.387 reactance 0.0003000000
2018-02-16 16:53:23.550 reactance 0.0003900000 2018-02-16 16:58:16.929 reactance 0.0005800000
2018-02-16 16:54:03.276 reactance 0.0005300000 2018-02-16 16:58:56.961 reactance 0.0003000000
2018-02-16 16:54:44.223 reactance 0.0003800000 2018-02-16 16:59:39.217 reactance 0.0001900000
2018-02-16 16:55:24.769 reactance 0.0003200000 2018-02-16 17:00:19.129 reactance 0.0005800000
2018-02-16 16:56:10.028 reactance 0.0002700000 2018-02-16 17:00:59.328 reactance 0.0001500000
2018-02-16 16:56:57.624 reactance 0.0000900000 2018-02-16 17:01:39.138 reactance 0.0005400000
2018-02-16 16:57:37.387 reactance 0.0003000000 2018-02-16 17:02:19.786 reactance 0.0006600000
2018-02-16 16:58:16.929 reactance 0.0005800000 2018-02-16 17:03:00.236 reactance 0.0004700000
2018-02-16 16:58:56.961 reactance 0.0003000000 2018-02-16 17:03:44.343 reactance 0.0003300000
2018-02-16 16:59:39.217 reactance 0.0001900000 2018-02-16 17:04:24.996 reactance 0.0002200000
2018-02-16 17:00:19.129 reactance 0.0005800000 2018-02-16 17:05:05.754 reactance 0.0003200000
2018-02-16 17:00:59.328 reactance 0.0001500000 2018-02-16 17:05:48.512 reactance 0.0004600000
2018-02-16 17:01:39.138 reactance 0.0005400000 2018-02-16 17:06:29.248 reactance 0.0003700000
2018-02-16 17:02:19.786 reactance 0.0006600000 2018-02-16 17:07:09.819 reactance 0.0001300000
2018-02-16 17:03:00.236 reactance 0.0004700000 2018-02-16 17:07:50.392 reactance 0.0005500000
2018-02-16 17:03:44.343 reactance 0.0003300000 2018-02-16 17:08:32.397 reactance 0.0002000000
2018-02-16 17:04:24.996 reactance 0.0002200000 2018-02-16 17:09:14.778 reactance 0.0003000000
2018-02-16 17:05:05.754 reactance 0.0003200000 2018-02-16 17:09:57.688 reactance 0.0003100000


#10 Chubler_XL (Moderator), 4 Weeks Ago
Try:



Code:
gawk --lint -f awkscript4  file1.txt file2.txt

or if you have the correct hash bang at the top of your script (something like #!/usr/bin/gawk -f) you can do:


Code:
$ chmod 755 awkscript4
$ ./awkscript4 file1.txt file2.txt
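
A note on why the redirected form behaves differently, as far as I can tell: when awk is given file operands it reads only those and never touches standard input, so `< file1.txt file2.txt` hands awk just file2.txt. That would be consistent with the earlier output containing only reactance lines. A minimal sketch with two throwaway files (the names f1.txt and f2.txt and their contents are made up for illustration):

```shell
# Two throwaway input files (hypothetical names, not from the thread).
dir=$(mktemp -d)
printf 'from-file1\n' > "$dir/f1.txt"
printf 'from-file2\n' > "$dir/f2.txt"

# Redirect plus operand: awk reads only the operand; stdin (f1.txt) is ignored.
redirected=$(awk '{ print $0 }' < "$dir/f1.txt" "$dir/f2.txt")

# Both files as operands: awk reads them in order, and FNR restarts at 1 for each.
operands=$(awk 'FNR == 1 { file++ } { print file ": " $0 }' "$dir/f1.txt" "$dir/f2.txt")

printf '%s\n' "$redirected"
printf '%s\n' "$operands"
rm -rf "$dir"
```

The second call uses the same `FNR==1 {file++}` idiom as the script above, which only works when both files arrive as operands.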

#11 delbroooks, 4 Weeks Ago
There is a bug in the code. I tried your suggestion, and what I see is that file1 (containing farads) is matched against itself, file2 (containing reactance) is matched against itself, and the two results are printed one after the other. I want file1 to be matched against file2 and the matched values printed.

This is what happens with


Code:
gawk --lint -f awkscript4  file1.txt file2.txt



Code:
 2018-02-17 00:05:40.967 farads 0.0001400000  2018-02-17 00:12:00.863 farads 0.0001600000
 2018-02-17 00:06:24.584 farads 0.0001000000  2018-02-17 00:12:00.863 farads 0.0001600000
 2018-02-17 00:07:04.742 farads 0.0002500000  2018-02-17 00:12:00.863 farads 0.0001600000
 2018-02-17 00:12:00.863 farads 0.0001600000  2018-02-17 00:16:56.912 farads 0.0002100000
 2018-02-17 00:12:41.023 farads 0.0002400000  2018-02-17 00:17:37.895 farads 0.0001800000
 2018-02-17 00:13:22.429 farads 0.0001500000  2018-02-17 00:18:18.354 farads 0.0003700000
 2018-02-17 00:14:04.826 farads 0.0004100000  2018-02-17 00:18:58.071 farads 0.0004700000
 2018-02-17 00:14:51.079 farads 0.0001600000  2018-02-17 00:18:58.071 farads 0.0004700000
 2018-02-17 00:15:31.247 farads 0.0003500000  2018-02-17 00:18:58.071 farads 0.0004700000
 2018-02-17 00:16:17.396 farads 0.0001900000 NA NA
 2018-02-17 00:16:56.912 farads 0.0002100000 NA NA
 2018-02-17 00:17:37.895 farads 0.0001800000 NA NA
 2018-02-17 00:18:18.354 farads 0.0003700000 NA NA
 2018-02-17 00:18:58.071 farads 0.0004700000 NA NA
 2018-02-17 18:19:38.135 farads 0.0002000000  2018-02-17 18:24:27.966 farads 0.0001800000
 2018-02-17 18:20:22.373 farads 0.0002600000  2018-02-17 18:25:11.832 farads 0.0002800000
 2018-02-17 18:21:02.161 farads 0.0003000000  2018-02-17 18:25:52.344 farads 0.0003000000
 2018-02-17 18:21:43.806 farads 0.0002700000  2018-02-17 18:26:33.672 farads 0.0002600000
 2018-02-17 18:22:25.394 farads 0.0002500000  2018-02-17 18:27:15.499 farads 0.0004300000
 2018-02-17 18:23:06.549 farads 0.0003100000  2018-02-17 18:27:55.288 farads 0.0004800000
 2018-02-17 18:23:46.638 farads 0.0002100000  2018-02-17 18:28:56.699 farads 0.0004200000
 2018-02-17 18:24:27.966 farads 0.0001800000  2018-02-17 18:29:40.909 farads 0.0002100000
 2018-02-17 18:25:11.832 farads 0.0002800000  2018-02-17 18:30:20.942 farads 0.0003400000
 2018-02-17 18:25:52.344 farads 0.0003000000  2018-02-17 18:31:03.937 farads 0.0003500000
 2018-02-17 18:26:33.672 farads 0.0002600000  2018-02-17 18:31:51.329 farads 0.0002500000
 2018-02-17 18:27:15.499 farads 0.0004300000  2018-02-17 18:32:32.608 farads 0.0005000000
 2018-02-17 18:27:55.288 farads 0.0004800000  2018-02-17 18:33:12.869 farads 0.0004900000
 2018-02-17 18:28:56.699 farads 0.0004200000  2018-02-17 18:33:52.725 farads 0.0002300000
 2018-02-17 18:29:40.909 farads 0.0002100000  2018-02-17 18:34:39.022 farads 0.0001300000
 2018-02-17 18:30:20.942 farads 0.0003400000  2018-02-17 18:35:20.579 farads 0.0002800000
 2018-02-17 18:31:03.937 farads 0.0003500000  2018-02-17 18:36:00.487 farads 0.0002400000
 2018-02-17 18:31:51.329 farads 0.0002500000  2018-02-17 18:36:51.908 farads 0.0004500000
 2018-02-17 18:32:32.608 farads 0.0005000000  2018-02-17 18:37:33.667 farads 0.0002500000
 2018-02-17 18:33:12.869 farads 0.0004900000  2018-02-17 18:38:13.989 farads 0.0004700000
 2018-02-17 18:33:52.725 farads 0.0002300000  2018-02-17 18:38:53.753 farads 0.0003500000
 2018-02-17 18:34:39.022 farads 0.0001300000  2018-02-17 18:39:34.052 farads 0.0004100000
 2018-02-17 18:35:20.579 farads 0.0002800000  2018-02-17 18:39:34.052 farads 0.0004100000
 2018-02-17 18:36:00.487 farads 0.0002400000  2018-02-17 18:39:34.052 farads 0.0004100000
 2018-02-17 18:36:51.908 farads 0.0004500000 NA NA
 2018-02-17 18:37:33.667 farads 0.0002500000 NA NA
 2018-02-17 18:38:13.989 farads 0.0004700000 NA NA
 2018-02-17 18:38:53.753 farads 0.0003500000 NA NA
 2018-02-17 18:39:34.052 farads 0.0004100000 NA NA
 NA NA
2018-02-16 16:46:09.300 reactance 0.0004300000 2018-02-16 16:51:22.525 reactance 0.0005900000
2018-02-16 16:47:10.987 reactance 0.0002800000 2018-02-16 16:52:01.997 reactance 0.0003900000
2018-02-16 16:47:51.611 reactance 0.0006500000 2018-02-16 16:52:43.612 reactance 0.0005200000
2018-02-16 16:47:51.612 reactance 0.0006500000 2018-02-16 16:52:43.612 reactance 0.0005200000
2018-02-16 16:48:34.077 reactance 0.0006600000 2018-02-16 16:53:23.550 reactance 0.0003900000
2018-02-16 16:49:17.015 reactance 0.0003300000 2018-02-16 16:54:03.276 reactance 0.0005300000
2018-02-16 16:49:59.075 reactance 0.0000700000 2018-02-16 16:54:44.223 reactance 0.0003800000
2018-02-16 16:50:40.486 reactance 0.0002400000 2018-02-16 16:55:24.769 reactance 0.0003200000
2018-02-16 16:51:22.525 reactance 0.0005900000 2018-02-16 16:56:10.028 reactance 0.0002700000
2018-02-16 16:52:01.997 reactance 0.0003900000 2018-02-16 16:56:57.624 reactance 0.0000900000
2018-02-16 16:52:43.612 reactance 0.0005200000 2018-02-16 16:57:37.387 reactance 0.0003000000
2018-02-16 16:53:23.550 reactance 0.0003900000 2018-02-16 16:58:16.929 reactance 0.0005800000
2018-02-16 16:54:03.276 reactance 0.0005300000 2018-02-16 16:58:56.961 reactance 0.0003000000

The match should actually look like this: the first four columns are the lines from file1 that need a match, and the last four columns are the matched values from file2.


Code:
2018-02-16 16:45:29.557 farads 0.0004300000 2018-02-16 16:50:40.486 reactance 0.0002400000
2018-02-16 16:46:09.300 farads 0.0004300000 2018-02-16 16:51:22.525 reactance 0.0005900000
2018-02-16 16:47:10.987 farads 0.0002800000 2018-02-16 16:52:01.997 reactance 0.0003900000
2018-02-16 16:47:51.611 farads 0.0006500000 2018-02-16 16:52:43.612 reactance 0.0005200000
2018-02-16 16:47:51.612 farads 0.0006500000 2018-02-16 16:53:23.550 reactance 0.0003900000
2018-02-16 16:48:34.077 farads 0.0006600000 2018-02-16 16:53:23.550 reactance 0.0003900000
2018-02-16 16:49:17.015 farads 0.0003300000 2018-02-16 16:54:03.276 reactance 0.0005300000
2018-02-16 16:49:59.075 farads 0.0000700000 2018-02-16 16:54:44.223 reactance 0.0003800000
2018-02-16 16:50:40.486 farads 0.0002400000 2018-02-16 16:55:24.769 reactance 0.0003200000
2018-02-16 16:51:22.525 farads 0.0005900000 2018-02-16 16:56:10.028 reactance 0.0002700000
2018-02-16 16:52:01.997 farads 0.0003900000 2018-02-16 16:56:57.624 reactance 0.0000900000
2018-02-16 16:52:43.612 farads 0.0005200000 2018-02-16 16:57:37.387 reactance 0.0003000000
2018-02-16 16:53:23.550 farads 0.0003900000 2018-02-16 16:58:16.929 reactance 0.0005800000
2018-02-16 16:54:03.276 farads 0.0005300000 2018-02-16 16:58:56.961 reactance 0.0003000000


#12 RudiC (Moderator), 4 Weeks Ago
For your first problem, try this simpler approach. It assumes there are only a few time stamps with duplicate seconds and normally large gaps in between, and it eliminates the need for a system call to date for every line by adding the epoch time to each line upfront:



Code:
paste <(date +"%s" -f<(cut -d" " -f1,2 data.txt)) data.txt | awk '
$1 in LN        {$1++
                }
                {TM[NR] = $1
                 sub ($1 ".", _)
                 LN[TM[NR]] = $0
                }
END             {for (n=1; n<=NR; n++)  {TMP = TM[n] + 300
                                         DT  = 0
                                         for (SEC=0; SEC<120; SEC++)    {if ((TMP + SEC) in LN) DT = +SEC
                                                                         if ((TMP - SEC) in LN) DT = -SEC
                                                                         if (DT) break
                                                                        }
                                         OUT = LN[TMP+DT]
                                         sub  (/farads./, _, OUT)
                                         $0 = LN[TM[n]] OFS (OUT?OUT:"NA NA")
                                         print
                                        }
                }
'

#13 RudiC (Moderator), 4 Weeks Ago
For your other problem, try


Code:
paste <(date +"%s" -f<(cut -d" " -f1,2 file2)) file2 > TMP2
paste <(date +"%s" -f<(cut -d" " -f1,2 file1)) file1 > TMP1
awk '

FNR == NR       {if ($1 in LN)  $1++
                 TM[NR] = $1
                 sub ($1 ".", _)
                 LN[TM[NR]] = $0
                 next
                }

                {TMP = $1 + 300
                 DT  = 0
                 for (SEC=0; SEC<120; SEC++)    {if ((TMP + SEC) in LN) DT = +SEC
                                                 if ((TMP - SEC) in LN) DT = -SEC
                                                 if (DT) break
                                                }
                 OUT = LN[TMP+DT]
                 sub ($1 ".", _)
                 print $0  OFS (OUT?OUT:"NA NA")
                }
' TMP2 TMP1

#14 delbroooks, 4 Weeks Ago
Quote:
Originally Posted by Chubler_XL View Post
Edit: previous solution could miss closer records that are before previous target this should be more accurate:



Code:
#!/usr/bin/awk -f
FNR==1 {file++}
{
  day=$1
  gsub(/-/, " ", day)
  split($2, t, ".")
  gsub(/:/, " ", t[1])
  x=mktime(day " " t[1]) + t[2] / 1000
  if(file==1) srctime[FNR]=x
  else desttime[FNR]=x
  records[file, FNR]=$0
}

END {
   offset=5*60
   max=2*60
   deststart=0
   for (rec in srctime) {
       target = srctime[rec] + offset
       offsetmin = target - max
       offsetmax = target + max
       best = 9999999
       found = 0
       cur=deststart+1
       while(cur in desttime && desttime[cur] < offsetmax) {
           if (desttime[cur] < target && desttime[cur] > offsetmin &&
               best > target - desttime[cur]) {
                  if( best = 9999999) deststart = cur
                  best= target - desttime[cur]
                  found=cur
           }
           if (desttime[cur] >= target) {
              if(best > desttime[cur] - target) {
                  best=desttime[cur] - target
                  found=cur
               }
               break
           }
           cur++
        }

        if (found)
           print records[1, rec] " " records[2, found]
        else
           print records[1, rec] " NA NA"
    }
}

This matches the two files well, but I am getting a warning that says "assignment used in conditional context":



Code:
awk: awkscript5:30: (FILENAME=file2.txt FNR=175) warning: assignment used in conditional context
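
Note: the warning appears to point at the line `if( best = 9999999) deststart = cur` in the quoted script. A single `=` assigns rather than compares, so the condition is always true, and `==` was presumably intended. A minimal sketch of the difference, using a made-up starting value of 5 for `best`:

```shell
# "=" stores 9999999 in best and then tests that non-zero value: always true.
assign=$(awk 'BEGIN { best = 5; if (best = 9999999) print "taken, best=" best }')

# "==" only compares, so best keeps its value and the else branch runs.
compare=$(awk 'BEGIN { best = 5; if (best == 9999999) print "taken"; else print "skipped, best=" best }')

printf '%s\n%s\n' "$assign" "$compare"
```

With `==`, `deststart` would only be updated for the first candidate found inside the window, which seems to be what the edit was aiming for.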

