    #1  
Old 10-14-2010
Registered User
 
Join Date: May 2008
Posts: 9
Thanks: 0
Thanked 0 Times in 0 Posts
regexp: match string that contains list of chars

Hi,

I'm curious about how to do a very simple thing with regular expressions that I'm unable to figure out.

If I want to find out if a string contains 'a' AND 'b' AND 'c' it can be very easily done with grep:

Code:
echo "$STRING" | grep a | grep b | grep c

but, how would you do that in a single regexp?

A possible solution would be:


Code:
/a.*b.*c|a.*c.*b|b.*c.*a|b.*a.*c|c.*b.*a|c.*a.*b/

but it's so ugly it's nasty!! (imagine if instead of 3 chars we have 10!)
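To put a number on how ugly it gets, here is a minimal Python sketch that builds the same brute-force alternation for any set of distinct characters. The helper name all_orders_pattern is made up for illustration; it is not from any library.

```python
import re
from itertools import permutations
from math import factorial


def all_orders_pattern(chars):
    """Build the brute-force alternation: one branch per ordering of chars.

    Assumes the characters in `chars` are distinct.
    """
    branches = (".*".join(re.escape(c) for c in order)
                for order in permutations(chars))
    return "|".join(branches)


pattern = all_orders_pattern("abc")
print(pattern)  # a.*b.*c|a.*c.*b|b.*a.*c|b.*c.*a|c.*a.*b|c.*b.*a
print(bool(re.search(pattern, "xcxbxa")))  # True
print(bool(re.search(pattern, "xcxbx")))   # False
print(factorial(10))  # 3628800 branches for 10 distinct characters
```

The number of branches is n! for n characters, so 10 characters would need over 3.6 million alternatives.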

any gurus out there know how to do this?

cheers

Last edited by vbe; 10-14-2010 at 10:12 AM.. Reason: code tags...
    #2  
Old 10-14-2010
Scrutinizer
Moderator
 
Join Date: Nov 2008
Location: Amsterdam
Posts: 7,350
Thanks: 144
Thanked 1,756 Times in 1,593 Posts

Code:
awk '/a/&&/b/&&/c/'

    #3  
Old 10-14-2010
Registered User
 
Join Date: May 2008
Posts: 9
Thanks: 0
Thanked 0 Times in 0 Posts
Scrutinizer: that's very cool, and it's better than chaining greps. But I would like to do it in a single regexp, i.e. without the use of shell tools like grep, awk, sed...
    #4  
Old 10-14-2010
Registered User
 
Join Date: Oct 2010
Location: Southern NJ, USA (Nord)
Posts: 3,786
Thanks: 8
Thanked 469 Times in 449 Posts
Well, the awk version is not one regex but one command, which is nicely formatted. You can also use sed, which is generally faster, partly because it does not evaluate the remaining regexes once a line has been deleted. The list of patterns can be of any length and is easy to read:


Code:
sed '
  /a/!d
  /b/!d
  /c/!d
 '



---------- Post updated at 10:45 AM ---------- Previous update was at 10:42 AM ----------

If you want just regex not commands, you are probably out of luck. The searches are too unrelated for one regex. What context do you want to use it in, if not a command?
    #5  
Old 10-14-2010
Scrutinizer
Moderator
 
Join Date: Nov 2008
Location: Amsterdam
Posts: 7,350
Thanks: 144
Thanked 1,756 Times in 1,593 Posts
If you want to use return codes (like grep -q):

Code:
awk '/a/&&/b/&&/c/{f=1;exit}END{if(!f){exit 1}}'

    #6  
Old 10-14-2010
Registered User
 
Join Date: May 2008
Posts: 9
Thanks: 0
Thanked 0 Times in 0 Posts
Quote:
Originally Posted by DGPickett
If you want just regex not commands, you are probably out of luck. The searches are too unrelated for one regex. What context do you want to use it in, if not a command?
For example, in any programming language that supports PCRE-style regexps: C, Perl, Python, Ruby... Of course every programming language has other ways to check this, for example in Python:

Code:
>>> s = "axbxc"
>>> 'a' in s and 'b' in s and 'c' in s
True

I just want to know if it's possible to do that in a single regular expression, just out of curiosity and simply to get a better understanding of regexps.

I tried to do it like this:


Code:
/([abc]).*([^\1]).*[^\2]/

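But of course that doesn't work: inside a character class, \1 is not a backreference (engines such as Perl and Python read it as an octal escape), so [^\1] matches almost any character rather than "anything except what the first parenthesis set matched"... I think this should be done with some kind of backtracking.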
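A quick check of that attempt, as a sketch in Python's re module only (other engines may treat \1 inside brackets differently, or reject it outright):

```python
import re

attempt = re.compile(r"([abc]).*([^\1]).*[^\2]")

# "aaa" contains neither 'b' nor 'c', yet the attempt still matches it.
print(bool(attempt.search("aaa")))  # True

# Inside [...] Python reads \1 as the character with octal code 1,
# not as a reference to the first group.
print(re.fullmatch(r"[^\1]", "a") is not None)     # True
print(re.fullmatch(r"[^\1]", "\x01") is not None)  # False
```

So the negated classes exclude only the control characters \x01 and \x02, which is why the pattern accepts nearly any string of three or more characters that contains an a, b or c.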

And BTW I'm sure that it can be done with regexps! I mean, if you can test whether a number is a prime number with regular expressions, I refuse to believe this simple thing can't be done.
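For what it's worth, one common way to express "contains a AND b AND c" in a single Perl-style regexp is a chain of lookahead assertions. This is a sketch assuming an engine with lookahead support (Perl, PCRE, Python's re); plain POSIX grep and sed regexps don't have it. The helper name all_chars_pattern is hypothetical.

```python
import re

# One lookahead per required character; each one scans ahead from the
# start of the string without consuming anything.
contains_abc = re.compile(r"^(?=.*a)(?=.*b)(?=.*c)", re.DOTALL)


def all_chars_pattern(chars):
    """Build the same kind of pattern for an arbitrary set of characters."""
    lookaheads = "".join(f"(?=.*{re.escape(c)})" for c in chars)
    return re.compile("^" + lookaheads, re.DOTALL)


print(bool(contains_abc.search("axbxc")))  # True
print(bool(contains_abc.search("cxbxa")))  # True
print(bool(contains_abc.search("axbx")))   # False
print(bool(all_chars_pattern("abcdefghij").search("jihgfedcba")))  # True
```

The pattern grows by one lookahead per character, so 10 characters need 10 short groups rather than one branch per ordering.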
    #7  
Old 10-14-2010
Registered User
 
Join Date: Oct 2010
Location: Southern NJ, USA (Nord)
Posts: 3,786
Thanks: 8
Thanked 469 Times in 449 Posts
Well, if you dislike it but must apply the three regexes in sequence, you might try this simple heuristic: put the three regexes in a list and apply them in the current order; if one misses, removing a line from contention, move it to the top of the list if it is not already there, sliding the others down. For instance, consider c, e and q as the three regexes. The q will usually reject more lines than c, and the e fewer, but by letting the best rejecters float to the top, you save a lot of second and third regex searches.

I have a name for this, but it is not politically correct, something about how a dictator selects a military commander -- death at first failure. It came to me one day as a text editor took very long to find instances of 'equal': it did a character scan and, for every occurrence of the first character, it stopped and did a string compare, tragically missing the filtering power of q. If I searched for 'qual', it was quick (Borland Sprint on i386 DOS emulation under UNIX SVR3).
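The heuristic above can be sketched in Python roughly as follows. This is only an illustration under my own naming (filter_lines is a made-up helper), not code from any particular tool:

```python
import re


def filter_lines(lines, patterns):
    """Yield lines that match every pattern.

    Whenever a pattern rejects a line, it is moved to the front of the
    list so that the best rejecters get tried first on later lines.
    """
    order = [re.compile(p) for p in patterns]
    for line in lines:
        for i, rx in enumerate(order):
            if rx.search(line) is None:
                # This pattern rejected the line: promote it, then stop
                # checking this line (we break right after mutating).
                if i:
                    order.insert(0, order.pop(i))
                break
        else:
            # No pattern rejected the line.
            yield line


lines = ["cheque", "ceq", "abc", "quiet", "eq", "cqe"]
print(list(filter_lines(lines, ["c", "e", "q"])))  # ['cheque', 'ceq', 'cqe']
```

The result is the same as testing every pattern in a fixed order; only the number of searches spent on rejected lines changes.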