The UNIX and Linux Forums  

#1  deckard, 06-05-2008
[SOLVED]: REGEX: Matching Null?

I'm using Squid's URL regex feature to allow sites via a list of regex strings that match permitted domains. The regexes were copied from our previous proxy solution and seemed to "just work". However, we've recently discovered that some domains fail when the URL is used without the www prefix (likely due to virtual hosts or host header configuration, depending on whether the server is Apache or IIS respectively). Below is an example of what sometimes works:

Code:
http://.*\.microsoft\.com/.*
The '.*\.' before the 'microsoft\.com' portion SHOULD mean any number of any characters (zero or more) followed by a '.'. I see the error in the '\.' portion of the regex and plan to fix that. However, I've been unable to find a way to match both 'www.microsoft.com' and 'microsoft.com'. Here's what I thought would work:

Code:
http://[!.*|.*\.]microsoft\.com/.*
I admit to being really bad with regex, so please don't be too hard on me. I've just never been able to "get it" 100%. Needless to say, the above doesn't work for me at all: it matches neither 'microsoft.com' nor 'www.microsoft.com'. I've done some limited testing with 'grep' to try to find an adequate solution. But what is it that I'm really trying to match? At first I assumed I wanted a whitespace character, but I'm not looking for ' microsoft.com'. Then I thought, a null? But that seems impossible to match, since there's no character there to match at all. I'm sure an expert at regex would look at this and provide something insanely simple. I really don't want to do this:

Code:
http://[.*\.microsoft\.com/.*|microsoft\.com/.*]
or worse, this:

Code:
http://.*\.microsoft\.com/.*
http://microsoft\.com/.*
Any suggestions? Thanks in advance...

Last edited by deckard; 06-05-2008 at 10:57 AM.. Reason: Received a solution to the problem.
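
Editor's note: the bracketed attempt above fails because square brackets form a character class, which matches exactly one character from the listed set rather than one of two alternative sub-patterns. The sketch below illustrates this with Python's `re` module, which is an assumption made for the demonstration; Squid and grep use their own regex engines, though a character class behaves the same way for this pattern. The third URL is a made-up test input.

```python
import re

# The poster's first attempt. Inside [...] the characters ! . * | are
# treated as literal members of a set, so the class consumes exactly ONE
# character. It does not mean "nothing OR anything followed by a dot".
attempt = re.compile(r"http://[!.*|.*\.]microsoft\.com/.*")

urls = [
    "http://microsoft.com/",      # no character between // and microsoft
    "http://www.microsoft.com/",  # 'w' is not in the set
    "http://.microsoft.com/",     # hypothetical input: a single '.' is in the set
]

# search() is unanchored, like a typical grep match
results = {url: bool(attempt.search(url)) for url in urls}
for url, matched in results.items():
    print(url, matched)
```

Only the third, artificial URL matches, which is consistent with the observation that neither real hostname was allowed.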
#2  spirtle, 06-05-2008
I am unfamiliar with Squid, and regexps may work differently there, but it looks to me like you need the '?' operator, which matches the preceding expression zero or one times, e.g.
Code:
http://(www\.)?microsoft\.com/
does what you want when used as a grep argument.
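
Editor's note: a minimal sketch of the optional-group idea, using Python's `re` module as a stand-in for grep (an assumption; with grep itself, the `(...)?` syntax requires extended regular expressions, i.e. `grep -E`). The helper name `allowed` and the `download.microsoft.com` test URL are hypothetical.

```python
import re

# spirtle's pattern: the parentheses group "www." and the ? makes the
# whole group optional (zero or one occurrence).
pattern = re.compile(r"http://(www\.)?microsoft\.com/")

def allowed(url):
    """Return True if the pattern is found anywhere in url (unanchored)."""
    return pattern.search(url) is not None

for url in (
    "http://microsoft.com/",
    "http://www.microsoft.com/",
    "http://download.microsoft.com/",  # hypothetical other subdomain
):
    print(url, allowed(url))
```

The first two URLs match and the third does not, because only the literal prefix `www.` is optional here.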
#3  deckard, 06-05-2008
Thanks!

Your suggestion wound up working for me. I changed all of my lines to the following format:

Code:
http://(.*\.)?microsoft\.com/.*
That seems to have worked well. I knew someone on here would find this to be a simple problem to solve.
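
Editor's note: the final pattern can be checked the same way. This is a sketch using Python's `re` module rather than Squid's own matcher, so treat the results as illustrative only. All test URLs are made up, and the `stricter` variant is a hypothetical alternative that does not appear in the thread. Because `.*` also matches `/`, the posted pattern accepts a URL on another host whose path happens to contain `.microsoft.com/`.

```python
import re

# deckard's final pattern: an optional "anything followed by a dot" prefix.
final = re.compile(r"http://(.*\.)?microsoft\.com/.*")

# Hypothetical tighter variant (not from the thread): anchor at the start
# and forbid '/' inside the optional subdomain part.
stricter = re.compile(r"^http://([^/]*\.)?microsoft\.com/")

tests = [
    "http://microsoft.com/",
    "http://www.microsoft.com/",
    "http://download.microsoft.com/",
    "http://notmicrosoft.com/",
    "http://evil.example/x.microsoft.com/",
]

final_results = {u: bool(final.search(u)) for u in tests}
strict_results = {u: bool(stricter.search(u)) for u in tests}

for u in tests:
    print(u, final_results[u], strict_results[u])
```

Both patterns accept the bare domain and its subdomains and reject `notmicrosoft.com`; they differ only on the last URL, which the posted pattern accepts and the hypothetical stricter one rejects.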