Keep only the closest match of timestamped row (include headers) from file1 to precede file2 row/s


 
Posted 06-29-2017

This question is related to one I asked last August, when I was trying to sort/merge two files by a millisecond time column (in this case column 6).

The script below, which RudiC provided last August, solved the puzzle of sorting/merging two files by time, except that it gets lost when the day changes and the time goes back to zero (next day).

Code:
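awk '
# First line of file2: read the file1 header and keep the file2 header
NR == 1         {getline HD1 < F1
                 HD2 = $0
                 next
                }

# file2 time has reached the pending file1 row: step through file1
$6 >= T[6]      {do     {LAST = TMP                 # last file1 row read so far
                         ST = getline TMP < F1      # ST: 1 = row read, 0 = end of file1
                         split (TMP, T, FS)
                        }
                 while (($6 >= T[6]) && (ST == 1))
                 if (ST == 0)   {LAST = TMP         # file1 exhausted: keep its final row
                                 T[6] = "ZZZ"       # sentinel that stops further matches
                                }
                 print HD1
                 print LAST
                 print HD2
                 print
                 next
                }
# Any other file2 row is printed unchanged
                {print
                }

' FS="," F1=file1 file2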

Recap: In the output file, each file1 row (with its header) needs to sit on top of the file2 row/s (with their header) whose column 6 is >= that file1 row's column 6 and < the next file1 row's column 6.

It works well until the day changes and the time restarts; then it gets lost and doesn't match the data properly.
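
The mismatch at midnight can be reproduced with plain arithmetic: column 6 counts milliseconds since midnight, so it drops from 86399214 to 214 when the day rolls over, and a comparison on column 6 alone treats the 6/20 rows as earlier than the 6/19 rows. A minimal sketch of a combined key, assuming column 5 (the DOY field) goes up by one each day as it does in the snippets below (21719 to 21720); `abs_msec` is a made-up helper name, not part of the original script:

```shell
# abs_msec DAY MSEC: hypothetical helper that folds the day counter (column 5)
# and the milliseconds since midnight (column 6) into one growing number.
abs_msec() {
    awk -v d="$1" -v m="$2" 'BEGIN { printf "%.0f\n", d * 86400000 + m }'
}

# Raw column 6 across midnight: 318 is not >= 86399214, so this prints 0.
awk 'BEGIN { print (318 >= 86399214) }'

# Combined key for the same two rows: the 6/20 row now compares as later.
abs_msec 21719 86399214    # 1876607999214
abs_msec 21720 318         # 1876608000318
```

If column 5 turns out not to be a running day count in the full files, this key would not be valid and a real date parse of column 1 would be needed instead.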

Original files (snippets from very large files)
Column 6 is the last column (the MSEC field)
Code:
File1 

TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/19/2017 23:59:58	1	572	11353	21719	86398214
6/19/2017 23:59:59	1	572	11354	21719	86399214
6/20/2017 0:00:00	1	572	11355	21720	214
6/20/2017 0:00:01	1	572	11356	21720	1214

File2

TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/19/2017 23:59:58	1	544	6677	21719	86398318
6/19/2017 23:59:59	1	544	6678	21719	86399318
6/20/2017 0:00:00	1	544	6679	21720	318
6/20/2017 0:00:01	1	544	6680	21720	1318

I need the output below, so that column 6 is still matched properly when the day changes and the time restarts.

The times from both files match in column 1, but in column 6 (milliseconds) the preceding file1 value is always less than the file2 value.
When the day changes to 6/20, the rows still need to be ordered like this.

Desired Output
File1 rows: G_ headers, APID 572
File2 rows: APID 544
Code:
TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/19/2017 23:59:58	1	572	11353	21719	86398214
TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/19/2017 23:59:58	1	544	6677	21719	86398318
TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/19/2017 23:59:59	1	572	11354	21719	86399214
TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/19/2017 23:59:59	1	544	6678	21719	86399318

TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/20/2017 0:00:00	1	572	11355	21720	214
TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/20/2017 0:00:00	1	544	6679	21720	318
TIMEFORMATTED	G_CCSDS_2HDR_FLAG	G_CCSDS_APID	G_CCSDS_SEQ_COUNT	G_CCSDS_DOY	G_CCSDS_MSEC
6/20/2017 0:00:01	1	572	11356	21720	1214
TIMEFORMATTED	CCSDS_2HDR_FLAG	CCSDS_APID	CCSDS_SEQ_COUNT	CCSDS_DOY	CCSDS_MSEC
6/20/2017 0:00:01	1	544	6680	21720	1318

This is so difficult for me to explain, sorry!

---------- Post updated 06-29-17 at 01:30 PM ---------- Previous update was 06-28-17 at 03:26 PM ----------

I may have found a workaround by adding column 1 to the conditional checks. That way it not only checks for >= on the column 1 time, but still keeps the lower millisecond number (column 6), with the file1 row preceding the file2 row.
There is probably a better way to compensate for the changing of the day/time back to zero, but this seems to work.

Code:
awk '
NR == 1         {getline HD1 < F1
                 HD2 = $0
                 next
                }

$6 >= T[6] && $1 >= T[1]   {do     {LAST = TMP
                         ST = getline TMP < F1
                         split (TMP, T, FS)
                        }
                 while (($6 >= T[6]) && ($1 >= T[1]) && (ST == 1))
                 if (ST == 0)   {LAST = TMP
                                 T[6] = "ZZZ"
                                }
                 print HD1
                 print LAST
                 print HD2
                 print
                 next
                }
                {print 
                }

' FS="," F1=file1 file2
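
One caveat with the column 1 check: awk compares those fields as strings, so for example "6/9/2017" sorts after "6/10/2017", and "9:00:00" after "10:00:00". A possible alternative, sketched here and not tested on the full files, is to compare on day * 86400000 + milliseconds. It assumes column 5 increases by one per day (as in the snippets, 21719 to 21720) and that both files are already in time order. The names `merge_by_day_and_msec`, `key` and `advance` are made up for this sketch, and the sample rows are written tab-separated, which is also an assumption:

```shell
#!/bin/sh
# Sketch only: merge on day * 86400000 + msec instead of column 6 alone.

# merge_by_day_and_msec FILE1 FILE2 [SEP]: hypothetical wrapper, SEP defaults to ","
merge_by_day_and_msec() {
    awk -F"${3:-,}" -v F1="$1" '
    function key(day, msec) { return day * 86400000 + msec }

    # Read the next file1 row into NXT/N; MORE becomes 0 at end of file1.
    function advance() {
        MORE = ((getline NXT < F1) > 0)
        if (MORE) split(NXT, N, FS)
    }

    NR == 1 { getline HD1 < F1      # file1 header
              HD2 = $0              # file2 header
              advance()             # first file1 data row
              next
            }

            { k = key($5, $6)
              found = 0
              # Step through file1 while its next row is not later than this file2 row.
              while (MORE && key(N[5], N[6]) <= k) {
                  LAST = NXT
                  found = 1
                  advance()
              }
              # A new closest file1 row was found: print it with both headers.
              if (found) { print HD1; print LAST; print HD2 }
              print
            }
    ' "$2"
}

# Usage with the two rows on either side of midnight from the snippets above.
row() { printf '%s\t%s\t%s\t%s\t%s\t%s\n' "$@"; }
dir=$(mktemp -d)
{ row TIMEFORMATTED G_CCSDS_2HDR_FLAG G_CCSDS_APID G_CCSDS_SEQ_COUNT G_CCSDS_DOY G_CCSDS_MSEC
  row "6/19/2017 23:59:59" 1 572 11354 21719 86399214
  row "6/20/2017 0:00:00" 1 572 11355 21720 214
} > "$dir/file1"
{ row TIMEFORMATTED CCSDS_2HDR_FLAG CCSDS_APID CCSDS_SEQ_COUNT CCSDS_DOY CCSDS_MSEC
  row "6/19/2017 23:59:59" 1 544 6678 21719 86399318
  row "6/20/2017 0:00:00" 1 544 6679 21720 318
} > "$dir/file2"
merge_by_day_and_msec "$dir/file1" "$dir/file2" '\t'
```

For comma-separated files, as in the original call with FS=",", the third argument can be left out. A file2 row that is earlier than every file1 row is printed on its own, without headers, in this sketch.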
