q with Perl Regex


 
# 1  
Old 07-24-2008

For a programming exercise, I am meant to design a Perl script that detects double letters in a text file.

I tried the following expressions:

Code:
# Check for any double letter within the alphabet

/[a-zA-Z]+/

# Check for any repetition of an alphanumeric character

/\w+/

I'm aware that + means to match one or more occurrences of the preceding character or character class, but neither of these met the requirements of my program.

Also

Code:
/[a-zA-Z]{1}/

did not help either.

After doing some searching, I stumbled across the correct form of the regex for the double letter case. It turned out to be

Code:
/(.)\1/

Now I know that . refers to any single character and the \1 refers to the first character in the line being read (if s/..../.... is being used), but I'm still puzzled as to why /(.)\1/ works and /[a-zA-Z]+/ does not for the case of double letters?

Many thanks,
James
# 2  
Old 07-24-2008
Quote:
Originally Posted by JamesGoh
Now I know that . refers to any single character and the \1 refers to the first character in the line being read (if s/..../.... is being used), but I'm still puzzled as to why /(.)\1/ works and /[a-zA-Z]+/ does not for the case of double letters?
* Incorrect text removed *

/[a-zA-Z]+/ matches any contiguous run of one or more letters, so not only 'AA' or 'zz' will match; 'Az' and even a single 'A' will match too.
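
To make the difference concrete, here is a minimal sketch that runs both patterns over a few sample words. The helper names letters_match and double_match and the word list are made up for illustration, and the // operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Returns the substring matched by /[a-zA-Z]+/, or undef if there is none.
sub letters_match {
    my ($text) = @_;
    return $text =~ /([a-zA-Z]+)/ ? $1 : undef;
}

# Returns the doubled pair matched by /([a-zA-Z])\1/, or undef if there is none.
# Group 1 is the whole pair; group 2 is the single letter that \2 repeats.
sub double_match {
    my ($text) = @_;
    return $text =~ /(([a-zA-Z])\2)/ ? $1 : undef;
}

for my $word (qw(AA Az a book 42)) {
    my $run    = letters_match($word) // '(no match)';
    my $double = double_match($word)  // '(no match)';
    print "$word: letter run = $run, double letter = $double\n";
}
```

The character class alone reports a match for every word that contains at least one letter, while the backreference version only reports words where the same letter appears twice in a row.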

Last edited by cbkihong; 07-24-2008 at 02:28 AM. Reason: Incorrect text removed
# 3  
Old 07-24-2008
\1 is a backreference to whatever was matched inside the parentheses in the regexp. So /(.)\1/ finds a double occurrence of whatever (.) matched. It is similar to $1 but is used inside the regexp itself. It is discussed in some detail here:

perlretut - perldoc.perl.org
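
For the original exercise (detecting double letters in a text file), a rough sketch along these lines might do. The sub name find_double_letters and the sample text are assumptions for illustration; the sample is read through an in-memory filehandle so the example is self-contained, whereas a real script would open the file by name. The pattern is case-sensitive, so 'Aa' does not count as a double.

```perl
use strict;
use warnings;

# Returns a list of [line_number, doubled_pair] for every line that
# contains a doubled letter. Only the first pair on each line is reported.
sub find_double_letters {
    my ($fh) = @_;
    my @hits;
    while (my $line = <$fh>) {
        # ([a-zA-Z]) captures one letter; \1 requires the same letter again.
        if ($line =~ /([a-zA-Z])\1/) {
            push @hits, [$., "$1$1"];
        }
    }
    return @hits;
}

# Sample input held in a string; a real script would open a file instead.
my $sample = "hello world\nno repeats here\nbookkeeper\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
printf "line %d: %s\n", @$_ for find_double_letters($fh);
close $fh;
```

Using [a-zA-Z] instead of . inside the parentheses keeps doubled spaces, digits, or punctuation from being reported as double letters.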
# 4  
Old 07-24-2008
Quote:
Originally Posted by cbkihong
Actually, not even /(.)\1/ is correct. In Perl, you should use /(.)$1/. The former syntax is there for compatibility with I think awk or sed but that should in general not be used in Perl, because Perl has more uses of backslash that may interfere with backtracking.
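That is not correct. Using \1 is perfectly good Perl code. \1 and $1 have two separate uses: \1 is a backreference inside the pattern, while $1 holds the captured text after a successful match. When $1 appears inside a pattern, it is interpolated before matching begins, so it contains whatever an earlier match left there (nothing, in the test below). See the link I posted in my previous post. A short test shows they do not do the same thing: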

Code:
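$_ = 'foobar';
# $1 is interpolated before matching; it is empty here, so this is just /(.)/
if (/(.)$1/) {
   print "\$1 = $1","\n";
}
# \1 is a backreference: one character followed by the same character again
if (/(.)\1/) {
   print "\\1 = $1","\n";
}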

output:

Code:
$1 = f
\1 = o
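
The interpolation of $1 is easier to see when an earlier match has already set it. The following is only a sketch; the helper names interpolated_match and backref_match and the string 'foobax' are invented for the demonstration, and // assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: presets $1 via an earlier match, then matches /(.)$1/.
# $1 is interpolated into the pattern before matching starts.
sub interpolated_match {
    my ($preset, $text) = @_;
    $preset =~ /(.+)/ or return;              # leaves $1 equal to $preset
    return $text =~ /(.)$1/ ? $1 : undef;     # pattern becomes /(.)<preset>/
}

# Hypothetical helper: matches /(.)\1/, a true backreference.
sub backref_match {
    my ($text) = @_;
    return $text =~ /(.)\1/ ? $1 : undef;
}

my $text = 'foobax';
print "interpolated \$1: ", interpolated_match('x', $text) // '(no match)', "\n";
print "backreference \\1: ", backref_match($text) // '(no match)', "\n";
```

With $1 preset to 'x', the first pattern is effectively /(.)x/ and captures the 'a' in front of the 'x', whereas the backreference version still finds the doubled 'o'.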

# 5  
Old 07-24-2008
Thanks everyone for your messages.

I also found that re-reading my notes more carefully was very helpful!
# 6  
Old 07-24-2008
This does not work:

/[a-zA-Z]+/

because it matches one or more of the characters inside the square brackets: any of those characters, in any order. You want to find the same character twice in a row, not one or more of any character inside the [] brackets.
# 7  
Old 07-24-2008
Interesting and thoughtful question. You use "(" and ")" to capture (remember) the text matched by a sub-pattern, and you recall the remembered text with "\" followed by a single digit (a backreference).

In your particular case, "(.)\1" means: remember a character, then require that same character again immediately afterwards.

You can extend this method to find words with several double letters in a row. '(.)\1(.)\2(.)\3' will match any word containing three consecutive double letters, e.g. bookkeeper.
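
As a rough sketch of that extension (the sub names has_three_consecutive_doubles and count_doubles are invented here, and the patterns are restricted to letters rather than using .), the chained backreferences can be compared with simply counting doubled letters anywhere in a word:

```perl
use strict;
use warnings;

# True if $word contains three doubled letters in a row (e.g. "ookkee").
sub has_three_consecutive_doubles {
    my ($word) = @_;
    return $word =~ /([a-zA-Z])\1([a-zA-Z])\2([a-zA-Z])\3/ ? 1 : 0;
}

# Counts non-overlapping doubled letters anywhere in $word.
# With /g in list context the match returns one capture per doubled pair.
sub count_doubles {
    my ($word) = @_;
    my $count = () = $word =~ /([a-zA-Z])\1/g;
    return $count;
}

for my $word (qw(bookkeeper Mississippi letter)) {
    printf "%s: doubles = %d, three in a row = %s\n",
        $word, count_doubles($word),
        has_three_consecutive_doubles($word) ? 'yes' : 'no';
}
```

A word such as Mississippi has three double letters, but they are separated by other letters, so only the counting approach picks them all up; the chained pattern needs the pairs to be adjacent.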