All combinations from simple regex


 
# 8  
Old 05-24-2013
Quote:
Originally Posted by beca123456
Yep! Sorry, I should have been more precise in my explanation.
Another example to clarify:

input:
Code:
[ab]cd?e

output:
Code:
acde
ace
bcde
bce

You're missing the point. There are an infinite number of strings that match the regular expression [ab]cd?e. "ace" is one; "ace" preceded or followed by any number of characters also matches. If your RE was anchored on both ends, then your examples would make sense. Even if you say that you will only accept strings that are three or four characters long, there are still 4 + 4 * (number of distinct characters in your current locale's LC_CTYPE definition - 1) strings that will match your RE.
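To illustrate the difference the anchors make, here is a minimal Python sketch (Python's `re` module stands in for whatever RE engine you are using, and the candidate strings are made up for the example). Every candidate contains "ace" or "bcde" somewhere, so the unanchored RE accepts all of them, while the anchored RE accepts only the exact strings:

```python
import re

unanchored = re.compile(r"[ab]cd?e")
anchored = re.compile(r"^[ab]cd?e$")

# Hypothetical candidates; each one contains "ace" or "bcde" somewhere.
candidates = ["ace", "bcde", "xace", "acez", "race car"]

# search() succeeds if the RE matches anywhere in the string.
unanchored_hits = [s for s in candidates if unanchored.search(s)]
anchored_hits = [s for s in candidates if anchored.search(s)]

print(unanchored_hits)
print(anchored_hits)
```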
# 9  
Old 05-25-2013
@Don Cragun: Ok I see what you mean.

So to be more precise again, the regex should actually be:
Code:
/^[ab]cd?e$/

Plus, the output must be 3 or 4 characters long (depending on whether "d" is included or not).

Quote:
Even if you say that you will only accept strings that are three or four characters long, there are still 4 + 4 * (number of distinct characters in your current locale's LC_CTYPE definition - 1) strings that will match your RE.
I am not sure I get this point...

I don't actually want to match anything. I just need to get all the combinations of the regex to see what it might or might not match.

regex -> combinations -> match

I am just interested in the second, intermediate step.

I understand it may sound weird...
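For what it's worth, the "regex -> combinations" step can be sketched in a few lines of Python, assuming the RE only ever uses literal characters, simple bracket lists like [ab] (no ranges, no negation), and the ? quantifier. The function name `expand` is just something made up for this example, and it simply drops a leading ^ and trailing $. It is not a general RE expander:

```python
from itertools import product

def expand(pattern):
    """Enumerate strings for a tiny RE subset: literals, [..] lists, and '?'."""
    # The anchors only say "nothing before/after", so drop them here.
    if pattern.startswith("^"):
        pattern = pattern[1:]
    if pattern.endswith("$"):
        pattern = pattern[:-1]

    slots = []  # one list of alternatives per position in the RE
    i = 0
    while i < len(pattern):
        if pattern[i] == "[":
            end = pattern.index("]", i)
            choices = list(pattern[i + 1:end])
            i = end + 1
        else:
            choices = [pattern[i]]
            i += 1
        # A following '?' makes the previous item optional.
        if i < len(pattern) and pattern[i] == "?":
            choices = choices + [""]
            i += 1
        slots.append(choices)

    # Cartesian product of the alternatives at every position.
    return ["".join(parts) for parts in product(*slots)]

for s in expand("^[ab]cd?e$"):
    print(s)
```

Anything beyond that subset (*, +, ranges, alternation, ...) needs a real parser, and * or + make the list infinite anyway.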
# 10  
Old 05-25-2013
There are 127 ** 4 + 127 ** 3 combinations of three or four characters for ASCII, 255 ** 4 + 255 ** 3 combinations of three or four characters for EBCDIC and ISO-8859-* code sets, and (several thousand) ** 4 + (several thousand) ** 3 combinations of three or four characters for Unicode. Of those, 4 + 4 * (number of distinct characters in your current locale's LC_CTYPE definition - 1) strings containing three or four characters will match your RE when your RE does not contain the anchors: "ace", "ace" preceded by any character except NUL, "ace" followed by any character except NUL, "bce", "bce" preceded by any character except NUL, "bce" followed by any character except NUL, "acde", and "bcde".

When your RE includes the anchors, only two out of (number of characters in your code set - 1) ** 3 three-character combinations will be matched by your RE: "ace" and "bce"; and only two out of (number of characters in your code set - 1) ** 4 four-character combinations will be matched by your RE: "acde" and "bcde".

In both cases the number of strings consisting of combinations of three or four characters depends on the code set underlying the locale you're using; not on the RE you use to match those strings.
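These counts can be checked by brute force on a small scale. The sketch below assumes a made-up 8-character code set ("abcdexyz", with no NUL to exclude) instead of a real locale, and uses Python's `re` module. It builds every three- and four-character string and counts how many the unanchored and anchored REs accept:

```python
import re
from itertools import product

# Hypothetical 8-character "code set"; a real locale has far more characters.
alphabet = "abcdexyz"

# Every string of exactly three or four characters over that alphabet.
strings = ["".join(t) for n in (3, 4) for t in product(alphabet, repeat=n)]

unanchored = sum(1 for s in strings if re.search(r"[ab]cd?e", s))
anchored = sum(1 for s in strings if re.search(r"^[ab]cd?e$", s))

print(len(strings), unanchored, anchored)
```

With 8 usable characters, the unanchored count comes out as 4 + 4 * 8 and the anchored count as 4, out of 8 ** 4 + 8 ** 3 candidate strings.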