q with Perl Regex
For a programming exercise, I am meant to design a Perl script that detects double letters in a text file.
I tried the following expressions:
Code:
# Check for any double letter within the alphabet
/[a-zA-Z]+/
# Check for any repetition of an alphanumeric character
/\w+/
Also:
Code:
/[a-zA-Z]{1}/
After doing some searching, I stumbled across the correct form of the regex for the double letter case. It turned out to be:
Code:
/(.)\1/
Many thanks,
James
Quote:
/[a-zA-Z]+/ only matches a contiguous sequence of letters, so not only 'AA' or 'zz' will match; 'Az' will match too.
Last edited by cbkihong; 07-23-2008 at 10:28 PM. Reason: Incorrect text removed
\1 is a backreference to whatever was matched by the first pair of parentheses in the regexp. So /(.)\1/ finds a double occurrence of whatever (.) matched. It is similar to $1, but it is used inside the regexp. It is discussed in some detail here:
perlretut - perldoc.perl.org
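A minimal sketch of how the backreference could be applied to the original exercise, assuming the text has already been read into a string. The helper name find_doubles is hypothetical, and the capture is narrowed to [a-zA-Z] because /(.)\1/ would also match doubled spaces or digits.

```perl
use strict;
use warnings;

# Hypothetical helper: return every doubled letter pair found in a string.
sub find_doubles {
    my ($text) = @_;
    my @pairs;
    # ([a-zA-Z]) captures one letter; \1 requires the same letter again.
    # The /g flag lets the loop continue after each match.
    while ($text =~ /([a-zA-Z])\1/g) {
        push @pairs, "$1$1";
    }
    return @pairs;
}

my @found = find_doubles('a foobar and a bookkeeper');
print join(',', @found), "\n";    # prints oo,oo,kk,ee
```

To scan a file, the same helper could be called on each line read from the file handle. The match is case-sensitive, so 'Aa' is not reported unless the /i flag is added.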
Quote:
Code:
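$_ = 'foobar';
# $1 is interpolated when the pattern is built. No match has succeeded yet,
# so it is empty and the pattern is effectively /(.)/, which matches 'f'.
if (/(.)$1/) {
    print "\$1 = $1","\n";
}
# \1 refers back to group 1 within the same match, so this finds 'oo'.
if (/(.)\1/) {
    print "\\1 = $1";
}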
Code:
$1 = f
\1 = o
This does not work:
/[a-zA-Z]+/ because it matches one or more of the characters inside the square brackets, in any combination and in any order. You want to find the same character twice in a row, not one or more of any character inside the [] brackets.
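A small sketch of that difference, with a hypothetical helper classify that tries both patterns on a word. The character class matches every word made of letters, while the backreference version only matches when a letter is immediately repeated.

```perl
use strict;
use warnings;

# Hypothetical helper: report which of the two patterns match a word.
sub classify {
    my ($word) = @_;
    my $class   = $word =~ /[a-zA-Z]+/    ? 1 : 0;  # one or more letters, any mix
    my $doubled = $word =~ /([a-zA-Z])\1/ ? 1 : 0;  # same letter twice in a row
    return ($class, $doubled);
}

for my $word ('Az', 'zz', 'abc') {
    my ($class, $doubled) = classify($word);
    print "$word: class=$class doubled=$doubled\n";
}
```

All three words match the character class, but only 'zz' matches the backreference pattern.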
Interesting and thoughtful question. You use "(" and ")" to capture (remember) the text matched by part of a pattern, and you recall the remembered text with "\" followed by a single digit (a backreference).
In your particular case, "(.)\1" means remember a character and then require the same character again. You can extend this method to find words with several double letters in a row. '(.)\1(.)\2(.)\3' will match any word with three consecutive double letters, e.g. bookkeeper.
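A short sketch of the three-pair pattern, using a hypothetical helper has_three_doubles. It illustrates that the pairs have to be adjacent: 'committee' contains three double letters, but the 'i' separates them, so it does not match.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the word contains three double letters in a row.
sub has_three_doubles {
    my ($word) = @_;
    # Each (.) captures one character and the following \N repeats it.
    return $word =~ /(.)\1(.)\2(.)\3/ ? 1 : 0;
}

for my $word ('bookkeeper', 'committee', 'foobar') {
    my $result = has_three_doubles($word) ? 'match' : 'no match';
    print "$word: $result\n";
}
```

Only 'bookkeeper' matches here. To count double letters that are spread out through a word, repeated matching with the /g flag would be a better fit than chaining groups.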