Match and Grep the nearest value in last field


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Match and Grep the nearest value in last field
# 1  
Old 06-22-2015
Match and Grep the nearest value in last field

Gents

I have this input file
file1 (uniq records)
Code:
54503207851 170211240
54503207911 170210837
54503208111 170215105
54503208112 170215210
54655210011 170223140
54655210091 170223738
54655210172 170224355
54655210251 170224741
54655210331 170225039
54655210411 170225505
54655210492 170230224
54655210571 170230632
54655210651 170231004
54655210731 170231601

Code:
file2 (duplicate records)
54503207851  1     1         11       2600      54503207851 170211240
54503207911  1     1          7       2600      54503207911 170210837
54503208111  1     7    1301482       2600      54503208111 170215105
54503208112  2     1          8       2600      54503208112 170215210
54655210011  1     1          1       2597      54655210011 170223140
54655210091  1     1          4       2597      54655210091 170223738
54655210172  2     1          2       2597      54655210172 170224355
54655210251  1     1          2       2597      54655210251 170224748 
54655210251  1     1          2       2597      54655210251 170224742
54655210331  1     1          1       2597      54655210331 170225040
54655210411  1     1          1       2597      54655210411 170225505
54655210492  2     1         46       2597      54655210492 170230024 
54655210492  2     1          1       2597      54655210492 170230225
54655210571  1     1          3       2597      54655210571 170230632
54655210651  1     1          1       2597      54655210651 170231004
54655210731  1     1          2       2597      54655210731 170231601

desired output
Code:
54503207851 170211240 1     1         11       2600      54503207851 170211240
54503207911 170210837 1     1          7       2600      54503207911 170210837
54503208111 170215105 1     7    1301482       2600      54503208111 170215105
54503208112 170215210 2     1          8       2600      54503208112 170215210
54655210011 170223140 1     1          1       2597      54655210011 170223140
54655210091 170223738 1     1          4       2597      54655210091 170223738
54655210172 170224355 2     1          2       2597      54655210172 170224355
54655210251 170224741 1     1          2       2597      54655210251 170224742
54655210331 170225039 1     1          1       2597      54655210331 170225040
54655210411 170225505 1     1          1       2597      54655210411 170225505
54655210492 170230224 2     1          1       2597      54655210492 170230225
54655210571 170230632 1     1          3       2597      54655210571 170230632
54655210651 170231004 1     1          1       2597      54655210651 170231004
54655210731 170231601 1     1          2       2597      54655210731 170231601

i am using the following code to get the output
Code:
awk '{x=substr($0,1,22)} FNR==NR{a[substr($0,1,11)]=$0;next} {print x, a[substr($0,1,11)]?  OFS a[substr($0,1,11)] : OFS "9999" }' file2 file1

The point is to match the column 1 in both files but to grep from file2 the nearest value comparing columns 2 in file1 and 7 in file2..

With my code I got any of them,, the purpose is to get the close value matching column 1 and comparing last column.

Thanks for your help
# 2  
Old 06-22-2015
Nearest to what? Sorry to be that thick.
# 3  
Old 06-22-2015
Hi RudiC
I say close value to column 2 in file1 comparing with value column 7 in file2 the last 6 dig correspond to time, Then as the duplicate value in column 1 file2 has 2 different time i would like to Keep the time which is more close to time in file 1.

Hope u can help me.
# 4  
Old 06-22-2015
You could try this (assumption is that file2 is sorted, as in your demo data):

Code:
awk '
FNR==NR{a[$1]=$2;next}
key && key != $1 { best=key=x; print ln }
$1 in a {
  if(a[$1]<$7) diff=$7-a[$1]
     else diff=a[$1]-$7
  if(!diff) {
     print
     key=best=x
     delete a[$1]
  } else if(!best || diff<best) {
     key=$1
     best=diff
     ln=$0
  }
}
END { if(key) print ln } ' file1 file2


Last edited by Chubler_XL; 06-22-2015 at 06:49 PM.. Reason: Edit: Ensure exact match trumps closest other record
This User Gave Thanks to Chubler_XL For This Post:
# 5  
Old 06-23-2015
Hi Xl
Thanks for your answer I will check it and I let you know

---------- Post updated at 08:28 AM ---------- Previous update was at 03:04 AM ----------

Hi XL

it works perfect, thanks a lot

but there is something that is not correct, per example if in the file2 there is 2 duplicate values and one of them have the same time of file1, so the script should keep the time like file1..

other thing please, how to ajust the code if we have also duplicate values in file1 column 1..

Code:
54503207851 170211240
54503207911 170210837
54503208111 170215105
54503208112 170215210
54655210011 170223140
54655210091 170223738
54655210172 170224355
54655210251 170224741
54655210251 170224741
54655210331 170225039
54655210411 170225505
54655210492 170230224
54655210492 170230224
54655210571 170230632
54655210651 170231004
54655210731 170231601

,

Then the time from file2 should adapt to the 2 duplicate values in file1.

Regards...

Last edited by jiam912; 06-23-2015 at 11:26 AM.. Reason: some corrections
# 6  
Old 06-23-2015
Your desired output from post#1 is matched neither by your own code snippet's result nor by Chubler_XL's. To introduce file1's second column as column 2 in the output, try
Code:
awk '
FNR==NR         {a[$1]=$2
                 next
                }
key != $1 &&
key             {best=key=x
                 print ln
                }

$1 in a         {T=$1
                 if ((diff=$7-a[T]) < 0) diff=-diff
                 sub ($1, $1 OFS a[$1])

                 if (!diff)     {print
                                 key=best=x
#                                       delete a[$1]
                                }
                 else if(!best || diff<best)    {key=$1
                                                 best=diff
                                                 ln=$0
                                                }
                }
END             {if (key) print ln }
' file1 file2
54503207851 170211240  1     1         11       2600      54503207851 170211240
54503207911 170210837  1     1          7       2600      54503207911 170210837
54503208111 170215105  1     7    1301482       2600      54503208111 170215105
54503208112 170215210  2     1          8       2600      54503208112 170215210
54655210011 170223140  1     1          1       2597      54655210011 170223140
54655210091 170223738  1     1          4       2597      54655210091 170223738
54655210172 170224355  2     1          2       2597      54655210172 170224355
54655210251 170224741  1     1          2       2597      54655210251 170224742
54655210331 170225039  1     1          1       2597      54655210331 170225040

(Chubler_XL's code adapted)

To your last question: If in file1 there's exact duplicates, nothing will happen except that a[$1] will be overwritten with the identical value. If the values are not identical, the last one will apply.
# 7  
Old 06-23-2015
Dear Rudi..

Thanks a lot

Just i was going to say something else Smilie

I get something like this:
file1
Code:
54655210492 170230224

file2
Code:
54655210492  2     1          1       2597 54655210492170230225  170230225
54655210492  2     1         46       2597 54655210492170230024  170230024

output
Code:
54655210492  2     1          1       2597 54655210492170230225  170230225

output desired
Code:
54655210492  2     1         46       2597 54655210492170230024  170230024

I will try your script, and check if give me the same result..

Please try to help me

---------- Post updated at 09:56 AM ---------- Previous update was at 09:47 AM ----------

Dear RudiC,

Just I try your code and it gives the same output. Smilie

Please try to help me to solve it..

As I say previously it needs to grep the rows where the time is exactly or the more closets.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk to match field between two files and use conditions on match

I am trying to look for $2 of file1 (skipping the header) in $2 of file2 (skipping the header) and if they match and the value in $10 is > 30 and $11 is > 49, then print the line from file1 to a output file. If no match is foung the line is not printed. Both the input and output are tab-delimited.... (3 Replies)
Discussion started by: cmccabe
3 Replies

2. Shell Programming and Scripting

Command/script to match a field and print the next field of each line in a file.

Hello, I have a text file in the below format: Source Destination State Lag Status CQA02W2K12pl:D:\CAQA ... (10 Replies)
Discussion started by: pocodot
10 Replies

3. Shell Programming and Scripting

Selecting nearest pattern match

I'm looking to match an error code against a list of possible codes and get the nearest match. The code would be a 6 character hexadecimal string. I have a file of error codes all of which have a specific first 3 characters, however, after that the last 3 characters may be specific or generic as... (3 Replies)
Discussion started by: dazedandconfuse
3 Replies

4. Shell Programming and Scripting

Grep the nearest value from a csv file to another

Hello, I have two csv file. The first one (a huge file) with 3 columns X, Y, Z The second one with 2 columns A, B All the values are numbers. I need to parse every "Z" field of the first file, find the closest "B" value in the second and extract from it the "A" relative value, to... (5 Replies)
Discussion started by: rjval
5 Replies

5. Shell Programming and Scripting

AWK: Pattern match between 2 files, then compare a field in file1 as > or < field in file2

First, thanks for the help in previous posts... couldn't have gotten where I am now without it! So here is what I have, I use AWK to match $1 and $2 as 1 string in file1 to $1 and $2 as 1 string in file2. Now I'm wondering if I can extend this AWK command to incorporate the following: If $1... (4 Replies)
Discussion started by: right_coaster
4 Replies

6. Shell Programming and Scripting

AWK Match to nearest number

Hello Guys, I'm very new on here and require some help matching up and printing some columns using awk. I have two text files. The first file has Longitude data in column 1 (lon.txt) and the second one (node.txt) has again another Longitude data in column 1 (not exact as the first one) + in... (7 Replies)
Discussion started by: ian_gooch
7 Replies

7. Shell Programming and Scripting

Need help to grep for a title match and then make some queries after the match

Here is the sample of my file address.txt Address 1 1234 Drive way New Orleans, LA Zipcode :- 12345 Address 2 4567 Spring way Chicago, IL Zipcode :- 67890 I would like to grep for an Address title (Ex :- Address 2) , then get its zipcode and echo both in a single line. Ex :- ... (3 Replies)
Discussion started by: leo.maveriick
3 Replies

8. UNIX for Advanced & Expert Users

sed match closest/nearest pattern

All i am struggling to raplace some text in a line between two (closest) patterns , line="/home/usr/bin/:/home/usr/devuser,n1.9/bin:/home/usr/root/bin" i want to replace "devuser,n1.9" with "NEWVAL", basically all teh text from "devuser" until nearest '/' with some new text. i tried teh... (1 Reply)
Discussion started by: sudheer1984
1 Replies

9. UNIX for Dummies Questions & Answers

match nearest

Hi, I'm trying to find the nearest match between two columns of numbers, e.g. 1,1 10,8 30,50 20,100 and the search could be e.g. 20,20 returning 10,8 - i.e. 20-10 = 10 and 20-8 = 12 totalling 22, and hence being the nearest match. any ideas? thanks a lot, (1 Reply)
Discussion started by: bogu0001
1 Replies

10. Shell Programming and Scripting

Urgent help on grep the nearest value

Hi, I have a list of data in text files with :- ColA ColB ColC ColD ColE ColF 12 122 2280 30 0.012987 1.776 13 118 2280 30 0.012987 1.954 14 115 2620 30 0.011321 2.131 15 113 2654 30 0.011177 2.309 16 145 1882 25 0.013110 2.487 17 112... (2 Replies)
Discussion started by: ahjiefreak
2 Replies
Login or Register to Ask a Question