awk match to update contents of file


 
# 8  
Old 09-07-2016
Quote:
Originally Posted by cmccabe
Thank you both very much.

---------- Post updated at 10:00 AM ---------- Previous update was at 09:08 AM ----------

Since the value in $1 may not be unique, I added multiple fields to the array index with o[i[++ic] = $1,$2,$3] =$1, but the output file is empty. Thank you.

file1
Code:
123     1     2
456     a     b
456     x     y
789     x     y

file2
Code:
456     x     y     z     1
789     x     y     z     2

awk
Code:
awk '
BEGIN {FS = OFS = "\t"
}
NR == 1 {
outfile = FILENAME
}
FNR == NR {
o[i[++ic] = $1,$2,$3] =$1        
}
{if($2 in o)
o[$2] = $1 OFS $2 OFS $3 OFS $4
}
END {for(j = 1; j <= ic; j++)
print o[i[j]] > outfile
}' file1 file2

desired result
Code:
123     1     2
456     a     b
456     x     y     z
789     x     y     z

What do you think is happening here?

Code:
o[i[++ic] = $1,$2,$3] =$1

# 9  
Old 09-07-2016
Quote:
what do you think is happening here?
Code:
o[i[++ic] = $1,$2,$3] =$1

I think $1,$2,$3 are being stored in array o? Thank you.

Or should that be
Code:
o[i[++ic] = $1,$2,$3] ={[$1,$2,$3]}

.
# 10  
Old 09-07-2016
Quote:
Originally Posted by cmccabe
Code:
o[i[++ic] = $1,$2,$3] =$1

I think $1,$2,$3 are being stored in array o? Thank you.
Or should that be
Code:
o[i[++ic] = $1,$2,$3] ={[$1,$2,$3]}

.
Hello cmccabe,

You could split the statement into two steps to make it easier to follow.
o[i[++ic] = $1,$2,$3] =$1 could be written as:
Code:
i[++ic] = $1,$2,$3   
o[i[ic]]=$1

Here is an explanation of o[i[++ic] = $1,$2,$3] =$1 (I have split it into two steps only to make it easier to understand).
Code:
i[++ic] = $1,$2,$3       
An array named i whose index is the increasing value of the variable ic, and whose value is $1,$2,$3.
o[i[++ic] = $1,$2,$3] =$1
An array named o whose index is the value stored in array i (whose index is the variable ic, increased by 1 each time), and whose value is field 1.

Thanks,
R. Singh
# 11  
Old 09-07-2016
Thank you, but I am getting a syntax error. Do I need to enclose the array in {}? Thank you.

Code:
awk '
BEGIN {    FS = OFS = "\t"
}
NR == 1 {
outfile = FILENAME
}
FNR == NR {
i[++ic] = $1,$2,$3
o[i[++ic] = $1,$2,$3] =$1
}
{    if($2 in o)
o[$2] = $1 OFS $2 OFS $4 OFS $5 OFS $50 OFS $51 OFS $52 OFS $53
}
END {    for(j = 1; j <= ic; j++)
print o[i[j]] > outfile
}' match.txt M28189_val
awk: cmd. line:8: i[++ic] = $1,$2,$3
awk: cmd. line:8:             ^ syntax error

# 12  
Old 09-07-2016
Hi cmccabe,
Let us go back to post #1 in this thread where you said that you were matching field #1 in file1 with field #2 in file2. Then let us go to post #6 where your new data has all numeric values in field #1 in file1 and all alphabetic values in field #2 in file2. Do you think that part of your problem might be that there are no matching fields?

In my code, the indexes in the array i[] are input line numbers and the values assigned to elements in array i[] are the keys (the field #1 value in file1, which will also be the field #2 value in matching lines in file2) used to look up values in the array o[]. Since there are no cases in any of your sample input where field #2 in file2 matches fields 1, 2, AND 3 in file1, you get an empty output line for each line in the starting contents of file1.
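
One way to see where the empty output lines come from is to run the expression from the earlier post on a single input line. This is only a minimal sketch, assuming a POSIX-style awk (for example gawk or mawk); the input line and the names result, plain, and composite are made up for illustration. Inside a subscript the commas separate index parts, so i[++ic] = $1 is evaluated as an assignment of its own and only its value becomes the first part of the index of o.

```shell
# Feed one tab-separated line through the expression from the post.
result=$(printf '456\tx\ty\n' | awk -F'\t' '
{
    o[i[++ic] = $1,$2,$3] = $1      # i[ic] receives only $1
}
END {
    plain     = (i[1] in o)                  # is "456" alone a key of o?
    composite = (("456", "x", "y") in o)     # is the three-part key present?
    printf "i[1]=%s\n", i[1]
    printf "plain key present: %d\n", plain
    printf "composite key present: %d\n", composite
}')
printf '%s\n' "$result"
```

With that input the sketch reports i[1]=456, plain key present: 0, and composite key present: 1. So o[i[j]] in the END block looks up a key that was never stored, which would explain the empty lines.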

Given that the format of your input files no longer matches the format described in post #1 in this thread, I have no idea what you are trying to do with the new index values you are using. It would help if you went back to the start and described the contents of both input files, the keys used to match lines in the two files, the fields to be included in the updated file1 for lines that have no matching line in file2, and the fields to be included in the updated file1 for lines that do have a matching line in file2.

If keys in file1 are not unique and you do not want identical output for those lines that have a matching key in file2, you need to clearly describe what output is supposed to be produced for each line in file1 with an identical key.

And, for the record, the standard way to assign a multi-dimensional index value to an array element would be:
Code:
o[i[++ic] = $1 SUBSEP $2 SUBSEP $3] = "whatever you want to print if no match is found for the 3 fields in file2 that match the 1st 3 fields of file1"
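
As a small check of the SUBSEP form above, the following sketch (again assuming a POSIX-style awk; the array name a, the variable out, and the sample values are hypothetical) shows that a comma-separated subscript and an explicit SUBSEP-joined string address the same element, and that split() can take such a key apart again.

```shell
out=$(awk 'BEGIN {
    a["456", "x", "y"] = "comma form"        # awk joins the parts with SUBSEP
    key = "456" SUBSEP "x" SUBSEP "y"        # the same key built by hand
    print a[key]                             # same element as above
    n = split(key, part, SUBSEP)             # recover the three parts
    print n, part[1], part[2], part[3]
}')
printf '%s\n' "$out"
```

This should print "comma form" followed by "3 456 x y". Storing the SUBSEP-joined string in i[] is what lets a later o[i[j]] lookup find the element again.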

# 13  
Old 09-07-2016
Here are a few lines of the tab-delimited input that I use (the actual files are several million lines).

file1
Code:
Match:
68521889    C    T
167099158    A    G
18122506    G    A

Basically, I am just trying to use $1, $2, and $3 as the unique key and use that to look up matching lines in file2, which is also tab-delimited. I apologize for the confusion and appreciate the help. Since my files are rather large I was trying to be brief, but I can see that's no help.

awk
Code:
awk '
BEGIN {    FS = OFS = "\t"
}
NR == 1 {
outfile = FILENAME
}
FNR == NR {
o[i[++ic] = $1 FS $2 FS $3] =$1
}
{    if($2 in o)
o[$2] = $1 OFS $2 OFS $4 OFS $5 OFS $50 OFS $51 OFS $52 OFS $53
}
END {    for(j = 1; j <= ic; j++)
print o[i[j]] > outfile
}' file1 file2

The awk runs but the result is just $1 of file1.
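
For comparison, here is a hedged sketch of a three-field lookup run against the small file1/file2 sample quoted earlier in the thread, not the real data. It assumes that the key is fields 1 to 3 in both files, that a matching line is replaced by fields 1 to 4 of file2, and that unmatched lines are kept unchanged. It writes to standard output instead of overwriting file1, and the temporary directory and the variable updated are illustrative only.

```shell
tmp=$(mktemp -d)
printf '123\t1\t2\n456\ta\tb\n456\tx\ty\n789\tx\ty\n' > "$tmp/file1"
printf '456\tx\ty\tz\t1\n789\tx\ty\tz\t2\n' > "$tmp/file2"

updated=$(awk '
BEGIN { FS = OFS = "\t" }
FNR == NR {                        # first file: remember lines in input order
    key = $1 SUBSEP $2 SUBSEP $3
    i[++ic] = key                  # line number -> key
    o[key] = $0                    # key -> default output (line unchanged)
    next
}
($1, $2, $3) in o {                # second file: all three fields match
    o[$1, $2, $3] = $1 OFS $2 OFS $3 OFS $4
}
END {
    for (j = 1; j <= ic; j++)
        print o[i[j]]
}' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$updated"
rm -rf "$tmp"
```

On the sample this yields the four lines shown under "desired result" earlier in the thread. The main differences from the posted attempt are that the default value stored in o[] is the whole line rather than $1, and that the second file is tested with the same three-part key rather than with $2 alone.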
# 14  
Old 09-07-2016
Quote:
Originally Posted by cmccabe
Thank you, but I am getting a syntax error. Do I need to enclose the array in {}? Thank you.
Code:
awk '
BEGIN {    FS = OFS = "\t"
}
NR == 1 {
outfile = FILENAME
}
FNR == NR {
i[++ic] = $1,$2,$3
o[i[++ic] = $1,$2,$3] =$1
}
{    if($2 in o)
o[$2] = $1 OFS $2 OFS $4 OFS $5 OFS $50 OFS $51 OFS $52 OFS $53
}
END {    for(j = 1; j <= ic; j++)
print o[i[j]] > outfile
}' match.txt M28189_val
awk: cmd. line:8: i[++ic] = $1,$2,$3
awk: cmd. line:8:             ^ syntax error

Hello cmccabe,

Of course it will give an error: separating values with commas is allowed inside an array's index, but not when assigning a value (though I am not at all sure what you are trying to do here). If you want to store a value like the one above, you could do the following:
Code:
i[++ic] = $1","$2","$3

Please rephrase your requirement, as it is not at all clear (to me at least) what you are trying to do. Solutions were given for your previous requirement, so if you now have a different requirement, please restate it with sample input files, all the rules, and the expected output. I hope this helps.

Thanks,
R. Singh