awk to update value based on pattern match in another file
Post 303003438 by MadeInGermany on Wednesday 13th of September 2017 04:00:33 PM
Have you made any progress?
Here is my attempt.
I have added another loop over the ";"-separated parts in $12.
Within each part it still cycles through every character after "p." and replaces it with its value from a[ ] if there is one.
Code:
awk '
  BEGIN { OFS="\t" }
  # The input files are processed one by one and the following code runs for each line
  # FNR is equal to NR only while processing the first file (file1)
  # a[ ] is indexed by the one letter code, its value is the three letter code
  FNR==NR { a[$1]=$2; next }
  # "next" skips the remaining rules and goes to the next input cycle
  # The following code runs for file2 (and further files)
  $12 ~ /:NM_/ {
    ostring=""
    # split $12 by ";" and cycle through them
    nNM=split($12,NM,";")
    for (n=1; n<=nNM; n++) {
      if (n>1) ostring=(ostring ";") # append ";"
      if (match(NM[n],/p[.].*/)) {
        # copy up to "p."
        ostring=(ostring substr(NM[n],1,RSTART+1))
        # Get the substring after "p."
        VAL=substr(NM[n],RSTART+2)
        # Get its length
        lenVAL=length(VAL)
        # Cycle through each character, append to ostring, if in a[ ] replace by its value
        for (i=1; i<=lenVAL; i++) {
          c=substr(VAL,i,1)
          ostring=(ostring ((c in a) ? a[c] : c))
        }
      } else {
        # append the unchanged string
        ostring=(ostring NM[n])
      }
    }
    # copy ostring back to $12 (unconditionally)
    $12=ostring
  }
  # always print
  { print }
' file1 FS="\t" file2
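
To try the script on something small, here is a self-contained run. The sample rows are made up for illustration (hypothetical gene names, NM_ accessions and a three-entry code table), and the convert wrapper function is just an assumed name for this sketch; only the awk logic is taken from the code above.

```shell
#!/bin/sh
# Hypothetical demo: the sample data and the convert() wrapper are illustrative assumptions.

convert() {
  # $1 = code table (one-letter code, three-letter code), $2 = tab-separated data file
  awk '
    BEGIN { OFS="\t" }
    FNR==NR { a[$1]=$2; next }
    $12 ~ /:NM_/ {
      ostring=""
      nNM=split($12,NM,";")
      for (n=1; n<=nNM; n++) {
        if (n>1) ostring=(ostring ";")
        if (match(NM[n],/p[.].*/)) {
          # keep everything up to and including "p."
          ostring=(ostring substr(NM[n],1,RSTART+1))
          VAL=substr(NM[n],RSTART+2)
          lenVAL=length(VAL)
          # translate each character that has an entry in a[ ]
          for (i=1; i<=lenVAL; i++) {
            c=substr(VAL,i,1)
            ostring=(ostring ((c in a) ? a[c] : c))
          }
        } else {
          ostring=(ostring NM[n])
        }
      }
      $12=ostring
    }
    { print }
  ' "$1" FS="\t" "$2"
}

tmp=$(mktemp -d)

# file1: one-letter code and three-letter code, whitespace-separated
printf '%s\n' 'A Ala' 'V Val' 'G Gly' > "$tmp/file1"

# file2: 12 tab-separated fields; field 12 holds two ";"-separated parts
printf 'f1\tf2\tf3\tf4\tf5\tf6\tf7\tf8\tf9\tf10\tf11\t%s\n' \
  'GENE1:NM_000001:exon2:c.100C>T:p.A34V;GENE1:NM_000002:exon3:c.50G>A:p.G17A' \
  > "$tmp/file2"

# Show only the rewritten field 12
result=$(convert "$tmp/file1" "$tmp/file2" | cut -f12)
printf '%s\n' "$result"
# prints: GENE1:NM_000001:exon2:c.100C>T:p.Ala34Val;GENE1:NM_000002:exon3:c.50G>A:p.Gly17Ala

rm -rf "$tmp"
```

Note that only the characters after "p." are translated; the "C", "T", "G" and "A" in the "c." part of each entry stay as they are, and lines whose $12 does not contain ":NM_" are printed unchanged.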
