Keep only the closet match of timestamped row (include headers) from file1 to precede file2 row/s

06-29-2017

Registered User

29, 0

Join Date: Jul 2016

Last Activity: 14 July 2017, 9:57 AM EDT

Posts: 29

Thanks Given: 6

Thanked 0 Times in 0 Posts

Keep only the closet match of timestamped row (include headers) from file1 to precede file2 row/s

This is a question that is related to one I had last August when I was trying to sort/merge two files by millsecond time column (in this case column 6).

The script (below) that helped me last august by RudiC solved the puzzle of sorting/merging two files by time, except it gets lost when the day/time changes back to zero (next day).

Code:

awk '
NR == 1         {getline HD1 < F1
                 HD2 = $0
                 next
                }

$6 >= T[6]      {do     {LAST = TMP
                         ST = getline TMP < F1
                         split (TMP, T, FS)
                        }
                 while (($6 >= T[6]) && (ST == 1))
                 if (ST == 0)   {LAST = TMP
                                 T[6] = "ZZZ"
                                }
                 print HD1
                 print LAST
                 print HD2
                 print
                 next
                }
                {print 
                }

' FS="," F1=file1 file2

Recap: The output file needs to have file1 contents on top of file2 contents (headers included) where file2 column 6 is >= to file1 column 6, and file2 column 6 (same value) is < file1 column 6 (next value).

It works well until the day changes and time restarts, then it gets lost and doesn't match data properly.

Original files (snippets from very large files)
Column 6 = Bold

Code:

File1 

TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/19/2017 23:59:58	1	572	11353	21719	86398214
6/19/2017 23:59:59	1	572	11354	21719	86399214
6/20/2017 0:00:00	1	572	11355	21720	214
6/20/2017 0:00:01	1	572	11356	21720	1214

File2

TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/19/2017 23:59:58	1	544	6677	21719	86398318
6/19/2017 23:59:59	1	544	6678	21719	86399318
6/20/2017 0:00:00	1	544	6679	21720	318
6/20/2017 0:00:01	1	544	6680	21720	1318

I need it to do this below, so that when the day/time changes, it still needs to match column 6 properly.

You can see how the times from both files match in column 1, but in column 6 (millisecond) the preceding file1 is always less than file2 column 6.
When the day changes to 6/20, it still needs to sort properly like this.

Desired Output
File1 - green
File2 - red

Code:

TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/19/2017 23:59:58	1	572	11353	21719	86398214
TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/19/2017 23:59:58	1	544	6677	21719	86398318
TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/19/2017 23:59:59	1	572	11354	21719	86399214
TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/19/2017 23:59:59	1	544	6678	21719	86399318

TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/20/2017 0:00:00	1	572	11355	21720	214
TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/20/2017 0:00:00	1	544	6679	21720	318
TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/20/2017 0:00:01	1	572	11356	21720	1214
TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/20/2017 0:00:01	1	544	6680	21720	1318

This is so difficult for me to explain, sorry!

---------- Post updated 06-29-17 at 01:30 PM ---------- Previous update was 06-28-17 at 03:26 PM ----------

I may have found a workaround by adding column 1 in the conditional checks. That way it not only finds >= in the column 1 time, but still keeps the the lower millisecond number (column 6) with file 1 row preceding file 2 row.
There is probably a better way to compensate for the changing of the day/time back to zero, but this seems to work.

Code:

awk '
NR == 1         {getline HD1 < F1
                 HD2 = $0
                 next
                }

$6 >= T[6] && $1 >= T[1]   {do     {LAST = TMP
                         ST = getline TMP < F1
                         split (TMP, T, FS)
                        }
                 while (($6 >= T[6]) && ($1 >= T[1]) && (ST == 1))
                 if (ST == 0)   {LAST = TMP
                                 T[6] = "ZZZ"
                                }
                 print HD1
                 print LAST
                 print HD2
                 print
                 next
                }
                {print 
                }

' FS="," F1=file1 file2

aachave1

View Public Profile for aachave1

Find all posts by aachave1

UNIX for Beginners Questions & Answers

Keep only the closet match of timestamped row (include headers) from file1 to precede file2 row/s

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk to search field2 in file2 using range of fields file1 and using match to another field in file1

Discussion started by: cmccabe

2. UNIX for Beginners Questions & Answers

Keep only the closet match of timestamped row (include headers) from file1 to precede file2 row/s

Discussion started by: aachave1

3. Shell Programming and Scripting

Reading and appending a row from file1 to file2 using awk or sed

Discussion started by: ida1215

4. Shell Programming and Scripting

Print sequences from file2 based on match to, AND in same order as, file1

Discussion started by: pathunkathunk

5. Shell Programming and Scripting

Match single line in file1 to groups of lines in file2

Discussion started by: pathunkathunk

6. Shell Programming and Scripting

Get row number from file1 and print that row of file2

Discussion started by: Abhiraj Singh

7. Shell Programming and Scripting

Match part of string in file2 based on column in file1

Discussion started by: phoebus

8. UNIX for Dummies Questions & Answers

if matching strings in file1 and file2, add column from file1 to file2

Discussion started by: pathunkathunk

9. Shell Programming and Scripting

Match one column of file1 with that of file2

Discussion started by: polsum

10. Shell Programming and Scripting

match value from file1 in file2

Discussion started by: myguess21