Regex learning.


 
# 1  
Old 02-12-2020

Hello All,

A colleague asked me about a complex regex, so I wrote one using grep's -P option (PCRE). Since this was new learning for me, I thought I would share it with the forum.

Let's say we have an Input_file with the following test data:

Code:
cat Input_file
PROJECT = 1.1.1.1
Project = 1.1.1.1.1.1.1.1
PROJECT = "1.1.1.1.1"
ProJEct = '1.1'

The conditions are: the keyword project is fixed but may appear in any case, and the version after it is what we need as output. Apart from the first (major) component, which is digits only, every component of the version may also contain letters.

So I have come up with:

Code:
grep -ioP 'project\D+\K(\d+\.([\d,a-z,A-Z]+\.){1,}[\d,a-z,A-Z]+|\d+\.[\d,a-z,A-Z]+|\d+)'   Input_file

Explanation of above code:

-i: ignore case, which lets grep match the string project however it is capitalized.
-o: print only the matched part of each line instead of the whole line.
-P: interpret the pattern as a Perl-compatible regular expression (PCRE), which provides features such as \d, \D and \K.

Now coming to main code part:

project\D+: matches the string project (in any case, because of -i) followed by one or more non-digit characters (\D matches any non-digit).
\K: discards everything matched so far from the reported match, so only what follows is printed. This is a GREAT PCRE feature and I LOVED it :)
\d+\.([\d,a-z,A-Z]+\.){1,}[\d,a-z,A-Z]+|\d+\.[\d,a-z,A-Z]+|\d+: three alternatives that cover all the cases: digits followed by two or more dot-separated alphanumeric components, digits followed by exactly one such component, or digits alone. Note that the commas inside the brackets are literal, so a comma would also be accepted.

Since only the part after \K counts as the match, -o prints just the version from each line.
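Running the command against the sample data is a quick way to check the explanation. The sketch below recreates Input_file in a temporary file and also tries a shorter pattern, \d+(?:\.[\da-z]+)*, which is only an assumed equivalent for this sample and not part of the original post. It assumes GNU grep built with -P support.

```shell
# Recreate the sample data from the post in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
PROJECT = 1.1.1.1
Project = 1.1.1.1.1.1.1.1
PROJECT = "1.1.1.1.1"
ProJEct = '1.1'
EOF

# Pattern exactly as posted.
original=$(grep -ioP 'project\D+\K(\d+\.([\d,a-z,A-Z]+\.){1,}[\d,a-z,A-Z]+|\d+\.[\d,a-z,A-Z]+|\d+)' "$input")

# Assumed shorter form: a major number, then zero or more ".component" parts.
# With -i, [\da-z] already covers uppercase letters; the commas are dropped.
compact=$(grep -ioP 'project\D+\K\d+(?:\.[\da-z]+)*' "$input")

# Prints the four versions, one per line:
# 1.1.1.1
# 1.1.1.1.1.1.1.1
# 1.1.1.1.1
# 1.1
printf '%s\n' "$original"

# Report whether the two patterns agree on this sample.
[ "$original" = "$compact" ] && echo "both patterns agree on the sample"

rm -f "$input"
```

The shorter form only shows that the three alternatives can be folded into one repetition; whether it is suitable for other version formats would need checking against real data.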


I am still learning PCRE regex, so any suggestions and improvements are very welcome :)
Cheers and Happy learning.


Thanks,
R. Singh
# 2  
Old 02-12-2020
Hi,
It's a bit redundant: the -i option applies to the whole pattern, so the A-Z ranges can be dropped.
Code:
grep -ioP 'project\D+\K(\d+\.([\d,a-z]+\.){1,}[\d,a-z]+|\d+\.[\d,a-z]+|\d+)'

If you want to limit case-insensitivity to the keyword, it is better to write:
Code:
grep -oP '(?i:project)'
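
A minimal sketch of the difference between the two approaches, assuming GNU grep built with PCRE support. The sample line and the compact version pattern \d+(?:\.[\da-z]+)* are made up for illustration and do not come from the posts above.

```shell
# Hypothetical sample line: the version contains an uppercase letter.
line='Project = 1.2B.3'

# -i makes the whole pattern case-insensitive, so [\da-z] also accepts "B".
whole=$(printf '%s\n' "$line" | grep -ioP 'project\D+\K\d+(?:\.[\da-z]+)*')

# (?i:...) limits case-insensitivity to the keyword; "B" no longer matches.
scoped=$(printf '%s\n' "$line" | grep -oP '(?i:project)\D+\K\d+(?:\.[\da-z]+)*')

printf 'whole pattern: %s\n' "$whole"   # prints: whole pattern: 1.2B.3
printf 'keyword only:  %s\n' "$scoped"  # prints: keyword only:  1.2
```

Which behaviour is wanted depends on whether uppercase letters are valid in a version component.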

# 3  
Old 02-12-2020
Quote:
Originally Posted by nezabudka
Hi,
It's a bit redundant: the -i option applies to the whole pattern, so the A-Z ranges can be dropped.
Code:
grep -ioP 'project\D+\K(\d+\.([\d,a-z]+\.){1,}[\d,a-z]+|\d+\.[\d,a-z]+|\d+)'

If you want to limit case-insensitivity to the keyword, it is better to write:
Code:
grep -oP '(?i:project)'

Hello Nez,

Cool, thanks for sharing your views. IMHO, the reason I added those checks is so that a version in some other format does not produce a false positive in the output.

Thanks,
R. Singh
# 4  
Old 02-12-2020
Here is the opposite of \K:
Code:
grep -ioP 'project\D+(?=\d+\.([\d,a-z]+\.){1,}[\d,a-z]+|\d+\.[\d,a-z]+|\d+)'

Maybe it means: forget everything matched after this point?

--- Post updated at 18:18 ---

To be exact, it means forget THIS match (a positive lookahead).
Sorry, I got carried away.
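
The difference between \K and a positive lookahead can be seen on a single line. This is a small sketch, assuming GNU grep with -P support; the sample line is made up.

```shell
line='project = 42'

# \K drops "project = " from the reported match; only the digits are printed.
with_k=$(printf '%s\n' "$line" | grep -oP 'project\D+\K\d+')

# (?=\d+) only checks that digits follow; they are not part of the match,
# so the text before them is printed instead.
with_lookahead=$(printf '%s\n' "$line" | grep -oP 'project\D+(?=\d+)')

printf '[%s]\n' "$with_k"          # prints: [42]
printf '[%s]\n' "$with_lookahead"  # prints: [project = ]
```

So \K keeps what comes after it, while the lookahead keeps what comes before it.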

Last edited by nezabudka; 02-12-2020 at 10:49 AM..
# 5  
Old 02-28-2020
Not sure if this would be useful for you, but I found this tool a while back and it comes in handy when having to deal with regular expressions:

Expresso Regular Expression Tool
# 6  
Old 03-03-2020
Hello All,

I learnt an example of lazy matching in Perl regex, so I thought I would share it here.
Let's say the following is the Input_file.

Code:
cat Input_file
abcdtest123^ DUMMYtestabcd12234 DUMMY bla blabla12231311313blabla bla.....,,,,,bla
test132131 ^ DUMMY blabla1213 121313_ 131y7351eg1eub wdfwfknfidh28e7ty;;;

Now suppose we want to replace everything from the first ^ up to and including the first DUMMY that is followed by whitespace. A lazy match can do that as follows:

Code:
perl -pe 's|(\^.*?DUMMY\s+)(.*)| new_text_here.... $2|'  Input_file

Output will be as follows for mentioned sample:
Code:
abcdtest123 new_text_here.... bla blabla12231311313blabla bla.....,,,,,bla
test132131  new_text_here.... blabla1213 121313_ 131y7351eg1eub wdfwfknfidh28e7ty;;;

Why is a lazy match good here? Because .* is GREEDY: it matches as much as possible, so \^.*DUMMY\s+ would run up to the last DUMMY followed by whitespace on the line. The lazy .*? matches as little as possible, so \^.*?DUMMY\s+ stops at the very first DUMMY followed by whitespace after the ^.

I wrote and tested this in Perl; thoughts, views and improvements are most welcome.

Thanks,
R. Singh
# 7  
Old 03-03-2020
My answer was off topic :)

Last edited by nezabudka; 03-03-2020 at 04:17 PM..