Grep with regex containing one string but not the other

04-09-2015

Greedy matching is doing you in. All "(?!foobuzz)..." needs to find is one spot before UserID1 where the text doesn't begin with 'foobuzz', and its condition is satisfied. The .* after it swallows everything else. So it looks at the first character, goes 'yay, it's not foobuzz', searches for the user ID and accepts.
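A quick way to see this, as a rough sketch: the pattern '(?!foobuzz).*UserID1' below is only a hypothetical stand-in for an unanchored lookahead (the original pattern may have looked different), and a GNU grep with -P support is assumed.

```shell
# Sample line whose fourth field is exactly "foobuzz" (we want it rejected).
line='2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo'

# Hypothetical unanchored pattern: the lookahead is tried at offset 0,
# where the text starts with "2015", so it succeeds and .* covers the rest.
naive_hits=$(printf '%s\n' "$line" | grep -cP '(?!foobuzz).*UserID1')

echo "naive pattern matched $naive_hits line(s)"
```

The count comes back as 1, so the unwanted line gets through.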

So, you need to force the regex to check for (?!foobuzz) in a relevant spot. Right after a | character perhaps. And you have to make it check all of them so it won't cheat by skipping ahead.

You can match entire fields with [^|]*\|, and force them to not start with foobuzz via (?!foobuzz)[^|]*\|. Match several in a row by wrapping that in ()*, and force the match to start at the beginning of the line with ^.

So ^((?!foobuzz)[^|]*\|)*UserID1 will only accept a line that starts with zero or more |-terminated fields, none of which begins with foobuzz, after which it must immediately find 'UserID1'.

Code:

$ printf "2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo\n" |
        grep -P '^((?!foobuzz)[^|]*\|)*UserID1'

$ printf "2015-04-08 19:04:56,157|yyyyyyyyyy|          |frobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo\n" | grep -P '^((?!foobuzz)[^|]*\|)*UserID1'
2015-04-08 19:04:56,157|yyyyyyyyyy|          |frobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo

I don't think rewriting it in awk or sed would necessarily make it slower, but if you have a big pile of regexes already, it could be a painful amount of work.
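For what it's worth, an awk version of this one check might look roughly like the sketch below. The function name filter is made up for illustration, and the logic assumes the same rule as the regex: accept a line when some field starts with UserID1 and no earlier field starts with foobuzz.

```shell
# Keep a line when some field starts with "UserID1" and no earlier
# field starts with "foobuzz" (same idea as the anchored regex above).
filter() {
    awk -F'|' '{
        for (i = 1; i <= NF; i++) {
            if (index($i, "foobuzz") == 1) next             # rejected
            if (index($i, "UserID1") == 1) { print; next }  # accepted
        }
    }'
}

bad='2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo'
good='2015-04-08 19:04:56,157|yyyyyyyyyy|          |frobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo'

out=$(printf '%s\n' "$bad" "$good" | filter)
echo "$out"
```

Only the frobuzz line should come out. Since index() compares literal strings, no regex engine is involved at all here.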

Last edited by Corona688; 04-09-2015 at 01:29 PM..

Corona688

04-10-2015

Thank you for your explanation! Now I've got it!

Quote:

Originally Posted by Corona688

... rewriting ... could be a painful amount of work.

Yeah, you could bet on it...

---------- Post updated 04-10-15 at 04:56 PM ---------- Previous update was 04-09-15 at 07:23 PM ----------

D'oh, I cheered too early. The one-liner did the job, but now grep surprised me by saying "grep: the -P option only supports a single pattern". OK, I will find a different solution by means of postprocessing.
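One possible way around that message, sketched under a few assumptions: the patterns live one per line in a file (a temporary file stands in for it here), none of them relies on numbered backreferences, and GNU grep with -P support is available. The idea is to join all patterns into one alternation so grep -P only ever sees a single pattern.

```shell
# Hypothetical pattern file: one Perl-style regex per line.
patfile=$(mktemp)
cat > "$patfile" <<'EOF'
^((?!foobuzz)[^|]*\|)*UserID1
garble
EOF

# Wrap each pattern in a non-capturing group and join them with "|".
combined=$(sed 's/.*/(?:&)/' "$patfile" | paste -sd '|' -)

ok_line='2015-04-08 19:04:55,926|xxxxxxxxxx|          |foobar|          |INFO |REQUEST|UserID1:42 | yeah'
bad_line='2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo'

# Lines matching any of the joined patterns are kept.
result=$(printf '%s\n' "$ok_line" "$bad_line" 'garble' | grep -P "$combined")
echo "$result"
rm -f "$patfile"
```

This should print the foobar line and the garble line. Whether it scales to a big pile of existing regexes is a separate question; very long alternations can get slow.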

stresing

04-10-2015

Well, if you're rewriting from scratch anyway, you could have a file full of awk expressions, like:

Code:

# Do NOT print line if this regex matches
/regex1/ { next }

# print line if it contains this literal string.  Faster than regex
index($0, "literalstring") { print ; next }

# print line if one regex matches and another does not
/regex1/ && !/regex2/ { print ; next }

# print line if any of these regexes match
/regex1/ || /regex2/ || /regex3/ || /regex4/ { print ; next }

etc..

You can use that file like

Code:

awk -f expressions.awk

with a filename optionally specified afterwards.
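As a small end-to-end sketch of that workflow: the two rules and the sample data below are placeholders picked for this thread's example, and the files are written to a temporary directory just so the snippet is self-contained.

```shell
workdir=$(mktemp -d)

# Made-up rule file; the two rules are placeholders for a real rule set.
cat > "$workdir/expressions.awk" <<'EOF'
# Do NOT print the line if it has a field that is exactly "foobuzz"
/[|]foobuzz[|]/ { next }

# print the line if it contains this literal string
index($0, "UserID1") { print ; next }
EOF

cat > "$workdir/data.txt" <<'EOF'
2015-04-08 19:04:55,926|xxxxxxxxxx|          |foobar|          |INFO |REQUEST|UserID1:42 | yeah
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuz|          |INFO |REQUEST|UserID1:23 | ohnooo
garble
EOF

# Rules are tried top to bottom; "next" stops processing of the current line.
result=$(awk -f "$workdir/expressions.awk" "$workdir/data.txt")
echo "$result"
rm -rf "$workdir"
```

The foobar and foobuz lines should be printed. The foobuzz line is dropped by the first rule, and the garble line falls through both rules without being printed. Note that rule order matters: the rejecting rule has to come before the accepting one.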

There is also a variant of awk that's further optimized for speed, mawk.

Corona688

04-10-2015

Hi.

There is a utility, peg, that may do what you want: it accepts multiple patterns written as Perl regular expressions (perlre). Using the pattern from disedorgue and an augmented data file, a demo is:

Code:

#!/usr/bin/env bash

# @(#) s2	Demonstrate complex matching expressions: peg
# peg (various versions):
# http://www.cpan.org/authors/id/A/AD/ADAVIES/

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C peg

FILE=${1-data2}

pl " Input data file $FILE:"
cat $FILE

pl " Results:"
peg -e garble -e '\|(?!foobuzz)[^|]*\|[^|]*\|[^|]*\|[^|]*\|UserID1' $FILE

exit 0

producing:

Code:

$ ./s2

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian 5.0.8 (lenny, workstation) 
bash GNU bash 3.2.39
peg (local) 3.10

-----
 Input data file data2:
2015-04-08 19:04:55,926|xxxxxxxxxx|          |foobar|          |INFO |REQUEST|UserID1:42 | yeah
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuz|          |INFO |REQUEST|UserID1:23 | ohnooo
garble

-----
 Results:
2015-04-08 19:04:55,926|xxxxxxxxxx|          |foobar|          |INFO |REQUEST|UserID1:42 | yeah
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuz|          |INFO |REQUEST|UserID1:23 | ohnooo
garble

Here is how the peg documentation defines -e:

Code:

       -e overloaded
           -e PERLEXPR
               Specify a PERLEXPR to match.

               If used more than once, then it is equivalent to using -o.  For
               example, "peg -e foo -e bar baz", "peg -o foo bar -- baz", and
               "peg "/foo/ or /bar/" baz" are all equivalent.

Best wishes ... cheers, drl

drl

04-10-2015

Maybe there is a way using grep with the "-v" option. For example, my regex file:

Code:

$ cat reg.gr 
!UserID1
foobuzz

My file to parse:

Code:

$ cat gr.txt 
2015-04-08 19:04:55,926|xxxxxxxxxx|          |foobar|          |INFO |REQUEST|UserID1:42 | yeah
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuz|          |INFO |REQUEST|UserID1:23 | ohnooo

The grep command:

Code:

$ grep -vf reg.gr gr.txt 
2015-04-08 19:04:55,926|xxxxxxxxxx|          |foobar|          |INFO |REQUEST|UserID1:42 | yeah
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuz|          |INFO |REQUEST|UserID1:23 | ohnooo

$

Edit: No, this is wrong... it does not work.
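The reason, as far as I can tell: grep has no "!" negation inside a pattern file, so "!UserID1" is just a literal string that matches nothing here, and -v then only drops the lines containing foobuzz. The output above looks right only because every sample line happens to contain UserID1. A two-stage pipeline is one way to get the intended result. In this sketch the last data line is an extra test line added to show the difference, and the files are created in a temporary directory.

```shell
workdir=$(mktemp -d)
cat > "$workdir/gr.txt" <<'EOF'
2015-04-08 19:04:55,926|xxxxxxxxxx|          |foobar|          |INFO |REQUEST|UserID1:42 | yeah
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuzz|          |INFO |REQUEST|UserID1:23 | ohnooo
2015-04-08 19:04:56,157|yyyyyyyyyy|          |foobuz|          |INFO |REQUEST|UserID1:23 | ohnooo
an unrelated line without the user id
EOF
printf '%s\n' '!UserID1' 'foobuzz' > "$workdir/reg.gr"

# The -vf attempt: it also lets the unrelated line through.
wrong=$(grep -vf "$workdir/reg.gr" "$workdir/gr.txt")

# Stage 1 keeps lines containing UserID1, stage 2 drops |foobuzz| fields.
result=$(grep -F 'UserID1' "$workdir/gr.txt" | grep -vF '|foobuzz|')
echo "$result"
rm -rf "$workdir"
```

Here "wrong" holds three lines (including the unrelated one), while "result" holds only the foobar and foobuz lines.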
Regards.

Last edited by disedorgue; 04-10-2015 at 03:29 PM..

disedorgue
