Full Discussion: q with Perl Regex
Shell Programming and Scripting: Post 302217931 by cbkihong on Thursday 24th of July 2008 12:42:05 AM
Quote:
Originally Posted by JamesGoh
Now I know that . refers to any single character and that \1 refers to the first character in the line being read (if s/..../.... is being used), but I'm still puzzled as to why /(.)\1/ works instead of /[a-zA-Z]+/ for the case of double letters?
* Incorrect text removed *

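/(.)\1/ works because \1 is a backreference: it matches whatever the first capture group (.) has just matched, so the pattern only matches a character that is immediately followed by the same character.

/[a-zA-Z]+/ only means matching a contiguous sequence of one or more letters, with no requirement that the letters be the same, so not only 'AA' or 'zz' will match; 'Az' and even a single 'A' will match too.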
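To make the difference concrete, here is a small sketch in Python rather than Perl (Python's re module uses the same (.)\1 backreference syntax for this case). The helper names has_doubled_char and has_letter_run are made up for illustration only.

```python
import re

# (.) captures any one character; \1 then requires that same character again.
DOUBLED = re.compile(r"(.)\1")
# One or more ASCII letters in a row, with no requirement that they repeat.
LETTER_RUN = re.compile(r"[a-zA-Z]+")


def has_doubled_char(text):
    """True if some character is immediately followed by itself."""
    return DOUBLED.search(text) is not None


def has_letter_run(text):
    """True if the text contains at least one ASCII letter."""
    return LETTER_RUN.search(text) is not None


if __name__ == "__main__":
    for word in ("AA", "zz", "Az", "q"):
        print(word, has_doubled_char(word), has_letter_run(word))
```

Only the backreference version rejects 'Az' and 'q'; the character-class version accepts every word in the loop.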

Last edited by cbkihong; 07-24-2008 at 02:28 AM.. Reason: Incorrect text removed
 

string(n)						       Tcl Built-In Commands							 string(n)

__________________________________________________________________________________________________________________________________________________

NAME
       string - Manipulate strings

SYNOPSIS
       string option arg ?arg ...?
_________________________________________________________________

DESCRIPTION
       Performs one of several string operations, depending on option.	The legal options (which may be abbreviated) are:			   |

       string bytelength string
	      Returns a decimal string giving the number of bytes used to represent string in memory.  Because UTF-8 uses one to three bytes to
	      represent Unicode characters, the byte length will not be the same as the character length in general.  The cases where a script
	      cares about the byte length are rare.  In almost all cases, you should use the string length operation.  Refer to the
	      Tcl_NumUtfChars manual entry for more details on the UTF-8 representation.
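
The gap between byte length and character length can be seen with a short Python analogue. This is a sketch, not Tcl: it uses standard UTF-8, and Tcl's internal representation may differ in details. The function names are hypothetical.

```python
def char_length(text):
    """Number of Unicode characters, the analogue of 'string length'."""
    return len(text)


def utf8_byte_length(text):
    """Number of bytes in the standard UTF-8 encoding of the text."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    # Plain ASCII, one accented letter, and the euro sign.
    for sample in ("abc", "caf\u00e9", "\u20ac"):
        print(repr(sample), char_length(sample), utf8_byte_length(sample))
```

The two numbers agree only when every character is ASCII.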

       string compare ?-nocase? ?-length int? string1 string2
	      Perform a character-by-character comparison of strings string1 and string2.  Returns -1, 0, or 1, depending on whether string1 is
	      lexicographically less than, equal to, or greater than string2.  If -length is specified, then only the first length characters are
	      used in the comparison.  If -length is negative, it is ignored.  If -nocase is specified, then the strings are compared in a
	      case-insensitive manner.
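
The three-way result and the two options can be sketched in Python as follows. This is an illustrative analogue, not the Tcl implementation: the function name and keyword arguments are made up, and Python's lower() is assumed to be a close enough stand-in for -nocase.

```python
def string_compare(a, b, nocase=False, length=None):
    """Return -1, 0, or 1, mimicking the description of 'string compare'."""
    # A negative length is ignored, as the description states.
    if length is not None and length >= 0:
        a, b = a[:length], b[:length]
    if nocase:
        # Assumption: simple lowercasing approximates -nocase.
        a, b = a.lower(), b.lower()
    # Python compares strings lexicographically by code point.
    return (a > b) - (a < b)


if __name__ == "__main__":
    print(string_compare("abc", "abd"))
    print(string_compare("abcX", "abcY", length=3))
    print(string_compare("ABC", "abc", nocase=True))
```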

       string equal ?-nocase? ?-length int? string1 string2											   |
	      Perform  a  character-by-character  comparison of strings string1 and string2.  Returns 1 if string1 and string2 are identical, or 0 |
	      when not.  If -length is specified, then only the first length characters are used in the comparison.  If -length is negative, it is |
	      ignored.	If -nocase is specified, then the strings are compared in a case-insensitive manner.					   |

       string first string1 string2 ?startIndex?												   |
	      Search  string2  for a sequence of characters that exactly match the characters in string1.  If found, return the index of the first
	      character in the first such match within string2.  If not found, return -1.  If  startIndex  is  specified  (in  any  of	the  forms |
	      accepted	by  the  index method), then the search is constrained to start with the character in string2 specified by the index.  For |
	      example,																   |
		     string first a 0a23456789abcdef 5												   |
	      will return 10, but														   |
		     string first a 0123456789abcdef 11 											   |
	      will return -1.															   |
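
The two examples above can be reproduced with Python's str.find, which takes a start position in the same way. This is a Python analogue only; string_first is a hypothetical name, and clamping a negative start to 0 is an assumption of the sketch.

```python
def string_first(needle, haystack, start=0):
    """Index of the first match at or after start, or -1 if there is none."""
    # Assumption: a negative start behaves like 0.
    return haystack.find(needle, max(start, 0))


if __name__ == "__main__":
    print(string_first("a", "0a23456789abcdef", 5))
    print(string_first("a", "0123456789abcdef", 11))
```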

       string index string charIndex
	      Returns the charIndex'th character of the string argument.  A charIndex of 0 corresponds to  the	first  character  of  the  string. |
	      charIndex may be specified as follows:												   |

	      integer																   |
			The char specified at this integral index										   |

	      end																   |
			The last char of the string.												   |

	      end-integer															   |
			The last char of the string minus the specified integer offset (e.g. end-1 would refer to the "c" in "abcd").		   |

	      If charIndex is less than 0 or greater than or equal to the length of the string then an empty string is returned.		   |
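
The index forms listed above (integer, end, end-integer) can be resolved as in this Python sketch. The names resolve_index and string_index are hypothetical, and only the three forms described here are handled.

```python
def resolve_index(spec, length):
    """Turn an integer, 'end', or 'end-N' into a zero-based position."""
    if isinstance(spec, int):
        return spec
    if spec == "end":
        return length - 1
    if spec.startswith("end-"):
        return length - 1 - int(spec[4:])
    raise ValueError("unsupported index form: %r" % (spec,))


def string_index(text, spec):
    """Character at the given index, or '' when the index is out of range."""
    i = resolve_index(spec, len(text))
    if i < 0 or i >= len(text):
        return ""
    return text[i]


if __name__ == "__main__":
    print(string_index("abcd", "end-1"))
    print(repr(string_index("abcd", 4)))
```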

       string is class ?-strict? ?-failindex varname? string
	      Returns 1 if string is a valid member of the specified character class, otherwise returns 0.  If -strict is specified, then an empty
	      string returns 0, otherwise an empty string will return 1 on any class.  If -failindex is specified, then if the function returns
	      0, the index in the string where the class was no longer valid will be stored in the variable named varname.  The varname will not
	      be set if the function returns 1.  The following character classes are recognized (the class name can be abbreviated):

	      alnum																   |
			Any Unicode alphabet or digit character.										   |

	      alpha																   |
			Any Unicode alphabet character. 											   |

	      ascii																   |
			Any character with a value less than \u0080 (those that are in the 7-bit ascii range).

	      boolean																   |
			Any of the forms allowed to Tcl_GetBoolean.										   |

	      control																   |
			Any Unicode control character.												   |

	      digit																   |
			Any Unicode digit character.  Note that this includes characters outside of the [0-9] range.				   |

	      double																   |
			Any of the valid forms for a double in Tcl, with optional surrounding whitespace.  In case of under/overflow in the value, |
			0 is returned and the varname will contain -1.										   |

	      false																   |
			Any of the forms allowed to Tcl_GetBoolean where the value is false.							   |

	      graph																   |
			Any Unicode printing character, except space.										   |

	      integer																   |
			Any  of  the  valid  forms  for an integer in Tcl, with optional surrounding whitespace.  In case of under/overflow in the |
			value, 0 is returned and the varname will contain -1.									   |

	      lower																   |
			Any Unicode lower case alphabet character.										   |

	      print																   |
			Any Unicode printing character, including space.									   |

	      punct																   |
			Any Unicode punctuation character.											   |

	      space																   |
			Any Unicode space character.												   |

	      true																   |
			Any of the forms allowed to Tcl_GetBoolean where the value is true.							   |

	      upper																   |
			Any upper case alphabet character in the Unicode character set. 							   |

	      wordchar																   |
			Any Unicode word character.  That is any alphanumeric character, and any Unicode connector  punctuation  characters  (e.g. |
			underscore).														   |

	      xdigit																   |
			Any hexadecimal digit character ([0-9A-Fa-f]).										   |

	      In  the  case of boolean, true and false, if the function will return 0, then the varname will always be set to 0, due to the varied |
	      nature of a valid boolean value.													   |
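
How -strict and -failindex interact can be sketched in Python for one class, xdigit. This is an illustration, not Tcl's implementation: is_xdigit is a hypothetical helper that returns the result together with the fail index instead of setting a variable, and the fail index of 0 for a strict empty string is an assumption of the sketch.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def is_xdigit(text, strict=False):
    """Return (result, fail_index); fail_index is None when result is 1."""
    if text == "":
        # Without strict an empty string passes any class.
        # Assumption: with strict the fail index is reported as 0.
        return (0, 0) if strict else (1, None)
    for i, ch in enumerate(text):
        if ch not in HEX_DIGITS:
            # First position where the class stops being valid.
            return (0, i)
    return (1, None)


if __name__ == "__main__":
    print(is_xdigit("1aF"))
    print(is_xdigit("12g4"))
    print(is_xdigit(""), is_xdigit("", strict=True))
```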

       string last string1 string2 ?startIndex? 												   |
	      Search string2 for a sequence of characters that exactly match the characters in string1.  If found, return the index of	the  first
	      character  in  the last such match within string2.  If there is no match, then return -1.  If startIndex is specified (in any of the |
	      forms accepted by the index method), then only the characters in string2 at or before the specified startIndex will be considered by |
	      the search.  For example, 													   |
		     string last a 0a23456789abcdef 15												   |
	      will return 10, but														   |
		     string last a 0a23456789abcdef 9												   |
	      will return 1.															   |
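
A Python analogue of the two examples above uses str.rfind with an exclusive end bound of startIndex + 1, so that only characters at or before startIndex are considered. string_last is a hypothetical name, and returning -1 for a negative start is an assumption of the sketch.

```python
def string_last(needle, haystack, start=None):
    """Index of the last match lying at or before start, or -1."""
    if start is None:
        return haystack.rfind(needle)
    if start < 0:
        # Assumption: nothing can match before the first character.
        return -1
    # rfind's end bound is exclusive, hence start + 1.
    return haystack.rfind(needle, 0, start + 1)


if __name__ == "__main__":
    print(string_last("a", "0a23456789abcdef", 15))
    print(string_last("a", "0a23456789abcdef", 9))
```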

       string length string
	      Returns  a  decimal  string  giving the number of characters in string.  Note that this is not necessarily the same as the number of
	      bytes used to store the string.													   |

       string map ?-nocase? charMap string
	      Replaces characters in string based on the key-value pairs in charMap.  charMap is a list of key value key value ...  as in the form
	      returned by array get.  Each instance of a key in the string will be replaced with its corresponding value.  If -nocase is
	      specified, then matching is done without regard to case differences.  Both key and value may be multiple characters.  Replacement is
	      done in an ordered manner, so the key appearing first in the list will be checked first, and so on.  string is only iterated over
	      once, so earlier key replacements will have no effect on later key matches.  For example,
		     string map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc										   |
	      will return the string 01321221.													   |
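
The ordered, single-pass behaviour described above can be sketched in Python. This is an analogue for illustration: string_map is a hypothetical helper that takes the map as a list of (key, value) pairs and does not implement -nocase.

```python
def string_map(pairs, text):
    """Single left-to-right pass; at each position the first listed key wins."""
    out = []
    i = 0
    while i < len(text):
        for key, value in pairs:
            # Empty keys are skipped so the scan always advances.
            if key and text.startswith(key, i):
                out.append(value)
                i += len(key)
                break
        else:
            # No key matched here: copy one character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    char_map = [("abc", "1"), ("ab", "2"), ("a", "3"), ("1", "0")]
    print(string_map(char_map, "1abcaababcabababc"))
```

Replacement text is never rescanned, which is why the "1" produced for "abc" is not turned into "0" afterwards.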

       string match ?-nocase? pattern string													   |
	      See if pattern matches string; return 1 if it does, 0 if it doesn't.  If -nocase is specified, then the pattern  attempts  to  match |
	      against  the  string  in	a case insensitive manner.  For the two strings to match, their contents must be identical except that the
	      following special sequences may appear in pattern:

	      * 	Matches any sequence of characters in string, including a null string.

	      ? 	Matches any single character in string.

	      [chars]	Matches any character in the set given by chars.  If a sequence of the form x-y  appears  in  chars,  then  any  character
			between  x  and y, inclusive, will match.  When used with -nocase, the end points of the range are converted to lower case |
			first.	Whereas {[A-z]} matches '_' when matching case-sensitively ('_' falls between the 'Z' and 'a'), with -nocase  this |
			is considered like {[A-Za-z]} (and probably what was meant in the first place).

	      \x	Matches the single character x.  This provides a way of avoiding the special interpretation of the characters *?[]\ in
			pattern.
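
The *, ? and [chars] rules, including the [A-z] pitfall, can be tried out with Python's fnmatch module. This is only an approximation of string match: fnmatch additionally treats [!chars] as negation and has no backslash escape, and glob_match is a hypothetical wrapper.

```python
from fnmatch import fnmatchcase


def glob_match(pattern, text):
    """Return 1 if the glob pattern matches the whole text, else 0."""
    # fnmatchcase takes the text first and the pattern second.
    return 1 if fnmatchcase(text, pattern) else 0


if __name__ == "__main__":
    # '_' lies between 'Z' and 'a', so the range A-z includes it.
    print(glob_match("[A-z]", "_"), glob_match("[A-Za-z]", "_"))
    print(glob_match("a*c", "abbbc"), glob_match("a?c", "ac"))
```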

       string range string first last
	      Returns a range of consecutive characters from string, starting with the character whose index is first and ending with the
	      character whose index is last.  An index of 0 refers to the first character of the string.  first and last may be specified as for
	      the index method.  If first is less than zero then it is treated as if it were zero, and if last is greater than or equal to the
	      length of the string then it is treated as if it were end.  If first is greater than last then an empty string is returned.
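
The clamping rules above can be sketched in Python. This is an analogue using integer indices only; string_range is a hypothetical name, and the end and end-integer forms are left out.

```python
def string_range(text, first, last):
    """Characters from index first through index last, both inclusive."""
    first = max(first, 0)              # below zero is treated as zero
    last = min(last, len(text) - 1)    # past the end is treated as end
    if first > last:
        return ""
    return text[first:last + 1]


if __name__ == "__main__":
    print(string_range("abcdef", 1, 3))
    print(string_range("abcdef", -5, 2))
    print(repr(string_range("abcdef", 3, 1)))
```

The explicit first > last check matters in Python, because a negative slice bound would otherwise count from the end of the string.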

       string repeat string count														   |
	      Returns string repeated count number of times.											   |

       string replace string first last ?newstring?
	      Removes a range of consecutive characters from string, starting with the character whose index is first and ending with the
	      character whose index is last.  An index of 0 refers to the first character of the string.  First and last may be specified as for
	      the index method.  If newstring is specified, then it is placed in the removed character range.  If first is less than zero then it
	      is treated as if it were zero, and if last is greater than or equal to the length of the string then it is treated as if it were
	      end.  If first is greater than last or the length of the initial string, or last is less than 0, then the initial string is
	      returned untouched.

       string tolower string ?first? ?last?
	      Returns a value equal to string except that all upper (or title) case letters have been converted to lower case.  If first is
	      specified, it refers to the first char index in the string to start modifying.  If last is specified, it refers to the char index
	      in the string to stop at (inclusive).  first and last may be specified as for the index method.

       string totitle string ?first? ?last?
	      Returns a value equal to string except that the first character in string is converted to its Unicode title case variant (or upper
	      case if there is no title case variant) and the rest of the string is converted to lower case.  If first is specified, it refers to
	      the first char index in the string to start modifying.  If last is specified, it refers to the char index in the string to stop at
	      (inclusive).  first and last may be specified as for the index method.

       string toupper string ?first? ?last?
	      Returns a value equal to string except that all lower (or title) case letters have been converted to upper case.  If first is
	      specified, it refers to the first char index in the string to start modifying.  If last is specified, it refers to the char index
	      in the string to stop at (inclusive).  first and last may be specified as for the index method.

       string trim string ?chars?
	      Returns a value equal to string except that any leading or trailing characters from the set given by chars are removed.  If chars is
	      not specified then white space is removed (spaces, tabs, newlines, and carriage returns).

       string trimleft string ?chars?
	      Returns a value equal to string except that any leading characters from the set given by chars are removed.  If chars is not
	      specified then white space is removed (spaces, tabs, newlines, and carriage returns).

       string trimright string ?chars?
	      Returns a value equal to string except that any trailing characters from the set given by chars are removed.  If chars is not
	      specified then white space is removed (spaces, tabs, newlines, and carriage returns).
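
The three trim operations map closely onto Python's strip, lstrip and rstrip, which also treat their argument as a set of characters. This is a Python analogue; the function names are hypothetical, and the default set is spelled out as the four characters named above because Python's own default whitespace set is broader.

```python
# The four whitespace characters named in the description above.
TCL_WHITESPACE = " \t\n\r"


def string_trim(text, chars=TCL_WHITESPACE):
    """Remove leading and trailing characters that belong to chars."""
    return text.strip(chars)


def string_trimleft(text, chars=TCL_WHITESPACE):
    """Remove leading characters that belong to chars."""
    return text.lstrip(chars)


def string_trimright(text, chars=TCL_WHITESPACE):
    """Remove trailing characters that belong to chars."""
    return text.rstrip(chars)


if __name__ == "__main__":
    print(repr(string_trim("  hi\r\n")))
    print(string_trimleft("xxhixx", "x"), string_trimright("xxhixx", "x"))
```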

       string wordend string charIndex
	      Returns the index of the character just after the last one in the word containing character charIndex of string.  charIndex may be
	      specified as for the index method.  A word is considered to be any contiguous range of alphanumeric (Unicode letters or decimal
	      digits) or underscore (Unicode connector punctuation) characters, or any single character other than these.

       string wordstart string charIndex
	      Returns the index of the first character in the word containing character charIndex of string.  charIndex may be specified as for
	      the index method.  A word is considered to be any contiguous range of alphanumeric (Unicode letters or decimal digits) or underscore
	      (Unicode connector punctuation) characters, or any single character other than these.

SEE ALSO
       expr(n), list(n)

KEYWORDS
       case conversion, compare, index, match, pattern, string, word, equal, ctype

Tcl									8.1								 string(n)