Regex within IF statement in awk


 
# 1  
Old 05-11-2013

Hello to all,

I have:
Code:
X="string 1-"
Y="-string 2"
Z="string 1-20-string 2"

Different numbers can appear in place of the 20, but I'm only interested when the number is 15, 20, 45 or 70.

I want to use an if statement with a regex in my awk code, in the following way.

Code:
if(Z!~/X(15|20|45|70)Y/)

but it doesn't seem to work. What is wrong? Is this possible?

The following way works:
Code:
if (Z != (X "15" Y) && Z != (X "20" Y) && Z != (X "45" Y) && Z != (X "70" Y))

But I want to shorten the code, and a regex would be much better.

Thanks in advance.
# 2  
Old 05-11-2013
This is an example of one way:
Code:
echo 'string 1-20-string 2' | awk -F, '{ if ($1 ~ /-(15|20|45|70)-/) print }'

# 3  
Old 05-11-2013
Quote:
Originally Posted by Ophiuchus
Code:
Z!~/X(15|20|45|70)Y/

but it doesn't seem to work. What is wrong? Is this possible?
You can use an expression in place of the regular expression literal.
Code:
Z !~ X "(15|20|45|70)" Y

Be aware that the contents of any string literals used as part of a regular expression must traverse two parsers: first the string parser, then the regular expression parser. This is significant if you need to use escape sequences. To learn more, see the gawk manual: computed regular expressions (dynamic/computed regular expressions are not gawk-specific; this was just a convenient link near the top of a web search).
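
As a hedged illustration of the two-parser point, the sketch below matches a literal | in three ways. The sample line is made up for the example. A regex literal needs one backslash, a string used as a regex needs two (the string parser consumes one of them), and a bracket expression avoids backslashes altogether.

```shell
# Hypothetical sample: a pipe-delimited fragment (not from the thread).
line='97|16|333712894320|87'

result=$(printf '%s\n' "$line" | awk '
{
    lit = ($0 ~ /16\|3337/)      # regex literal: one backslash escapes the |
    dyn = ($0 ~ "16\\|3337")     # string: "\\|" becomes \| after string parsing
    brk = ($0 ~ "16[|]3337")     # bracket expression: no backslash needed
    print lit, dyn, brk
}')
echo "$result"
```

With the assumed sample line, all three tests print 1. What a single backslash before | does inside a string literal is not portable; gawk, for one, warns and drops the backslash, which turns the | into alternation.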

Regards,
Alister

Last edited by alister; 05-11-2013 at 10:25 PM..
# 4  
Old 05-11-2013
The pattern within /PAT/ cannot include variables; everything between the slashes is taken literally. The fixed pattern that spacebar posted is one solution. A more general solution, which does exactly what you want, is as follows:
Code:
PAT=X "(15|20|45|70)" Y; if (Z !~ PAT)
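
For a version that can be run as-is, here is a minimal sketch using the strings from the original question. The ^ and $ anchors and the second test string (with 33) are assumptions added for illustration: the anchors make the regex behave like the exact != comparisons, and the extra string shows a non-matching case.

```shell
result=$(awk 'BEGIN {
    X = "string 1-"
    Y = "-string 2"
    # Anchors are an assumption: they force a whole-string match
    PAT = "^" X "(15|20|45|70)" Y "$"

    # Two test values: one with 20 (wanted), one with 33 (not wanted)
    n = split("string 1-20-string 2,string 1-33-string 2", Z, ",")
    for (i = 1; i <= n; i++)
        print Z[i] ": " ((Z[i] !~ PAT) ? "no match" : "match")
}')
echo "$result"
```

This works here because X and Y contain no regex metacharacters. If the real strings contain characters such as | or ., they need escaping before being used as part of a pattern.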

# 5  
Old 05-12-2013
Hello,

Thanks spacebar, alister and hanson44 for the answers.

I've tried them, but the only option that worked was putting the complete string literally inside the "//".

I mean, the only way that worked for me is (as in spacebar's solution):
Code:
if (Z !~ /string 1-(15|20|45|70)-string 2/)

I'm not sure why the other 2 solutions don't work. I'm using Cygwin.

The issue is that I was trying to replace "string 1-" and "-string 2" with variables because they are very long strings in the real code.

Thanks for the help.
# 6  
Old 05-12-2013
Quote:
I'm not sure why the other 2 solutions don't work. I'm using Cygwin.
I'm not sure why it is not working either. It "ought" to work. As you can probably see, the alister and hanson44 solutions are the same; the only difference is whether the pattern is assigned to a separate variable.

To figure out why it is not working, post what you are doing, how you are trying to implement the solution, and what the output is. It will help if you post more than a fragment: something that produces output and that we can replicate. Also post which version of awk is used in your Cygwin environment, since that might make a difference. awk --version will probably print the version. If not, find out the awk version however you can.
# 7  
Old 05-12-2013
Hello hanson44,

The awk version is:
Code:
$ awk --version
GNU Awk 4.0.1
Copyright (C) 1989, 1991-2012 Free Software Foundation.

The sample input file is:
Code:
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894315|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894334|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894320|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894302|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894391|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894345|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894320|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894345|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894370|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894315|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3

The output file should contain only the lines that do not match the regex inside the slashes "//":
Code:
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894334|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894302|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3
5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894391|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3

The code that works for me is:
Code:
awk '{if($0!~/5\|35\|998367383\|5\|3\|\|,7\|44\|783738002\|3\|55\|JK\|,97\|16\|3337128943(15|20|45|70)\|87\|50\|2\|,8,3,32,0,1,0,1,7,8,9,2,2,3/)
print}' input.txt

But if I try the following 2 options, the output is wrong: they print all lines instead of only the 3 lines shown in the desired output.
#1) This option doesn't give the desired output:
Code:
awk '
BEGIN {
X="5\|35\|998367383\|5\|3\|\|,7\|44\|783738002\|3\|55\|JK\|,97\|16\|3337128943"
Y="\|87\|50\|2\|,8,3,32,0,1,0,1,7,8,9,2,2,3"
Z=X "(15|20|45|70)" Y
}
{if($0 !~ Z); print}' input.txt

#2) This option doesn't give the desired output:
Code:
awk '
BEGIN {
X="5\|35\|998367383\|5\|3\|\|,7\|44\|783738002\|3\|55\|JK\|,97\|16\|3337128943"
Y="\|87\|50\|2\|,8,3,32,0,1,0,1,7,8,9,2,2,3"
Z1=X"15"Y
Z2=X"20"Y
Z3=X"45"Y
Z4=X"70"Y
}
{if($0!~Z1 && $0!~Z2 && $0!~Z3 && $0!~Z4); print}' input.txt

Thanks in advance for the help so far.

PS: I know I don't need to use $0, because with FS="|" the test would be $15 !~ /3337128943(15|20|45|70)/. But this is only an example: in my real code, each string that appears here as "$0" is held in a variable. For the purpose of my question the behaviour is the same: it only works when I put the string literally inside the slashes "//".
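
The field-based test mentioned above can be sketched as follows, using two lines from the sample input (the ones ending in 15 and 34). The ^ and $ anchors are an addition for illustration, so that the whole field has to match.

```shell
# Two lines from the sample input: one to drop (…15) and one to keep (…34)
result=$(printf '%s\n' \
  '5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894315|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3' \
  '5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|333712894334|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3' |
  awk -F'|' '$15 !~ /^3337128943(15|20|45|70)$/ { print $15 }')
echo "$result"
```

Only the field from the second line is printed, because 333712894334 does not end in one of the four wanted numbers.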

Regards

Last edited by Ophiuchus; 05-12-2013 at 05:09 PM..
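
For what it's worth, two things in option #1 would explain why every line is printed. First, the semicolon right after if(...) ends the if statement with an empty body, so the print that follows runs for every line. Second, "\|" inside a string literal goes through the string parser first; gawk drops that backslash (with a warning), so each | becomes alternation and the pattern matches almost anything. A hedged sketch of a corrected option #1, assuming those are the only problems; the pre and post shell variables are only there to build three sample lines (15, 34, 70) without repeating the long strings:

```shell
# Build three sample lines from the thread's input (suffixes 15, 34 and 70)
pre='5|35|998367383|5|3||,7|44|783738002|3|55|JK|,97|16|3337128943'
post='|87|50|2|,8,3,32,0,1,0,1,7,8,9,2,2,3'

result=$(printf '%s\n' "${pre}15${post}" "${pre}34${post}" "${pre}70${post}" | awk '
BEGIN {
    # Doubled backslashes: the string parser turns "\\|" into \| for the regex
    X = "5\\|35\\|998367383\\|5\\|3\\|\\|,7\\|44\\|783738002\\|3\\|55\\|JK\\|,97\\|16\\|3337128943"
    Y = "\\|87\\|50\\|2\\|,8,3,32,0,1,0,1,7,8,9,2,2,3"
    Z = X "(15|20|45|70)" Y
}
# No semicolon after the condition, so print belongs to the if
{ if ($0 !~ Z) print }')
echo "$result"
```

Of the three sample lines, only the one ending in 34 is printed. The same two changes apply to option #2.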