More Grep - Regular Expressions

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

grep and regular expressions

Hi All, For the past many days I have solved a lot of grep and regular expression questions, Now I am in a search for a good quality set of questions that can help me build and check my knowledge of grep with regular expressions, it would be great if anyone could help me with my requirement. ...

2. Homework & Coursework Questions

Regular Expressions with GREP

Use and complete the template provided. The entire template must be completed. If you don't, your post may be deleted! 1. The problem statement, all variables and given/known data: Given a text file (big_english.txt) containing roughly 250,000 words, answer the following using grep and...

3. UNIX for Dummies Questions & Answers

extract columns using grep or regular expressions

I am trying to print columns from a table whose name (header) matches a certain string. E.g., patient1001 patient1002 patient2005 patient3005 patient4001 0 0 0 0 0 2 9 2 8 3 2 7 3 0 2 Say I want to print columns whose names end with "01" patient1001 patient4001 0 0 2 3 2 2 ...

4. Shell Programming and Scripting

Regular Expressions

Hi Ilove unix and alwyas trying to to learn unix,but i am weak in using regular expressions.can you please give me a littel brief discription that how can i understand them and how to use .your response could lead a great hand in my unix love.

5. Shell Programming and Scripting

Regular Expressions

I am new to shell scripts.Can u please help me on this req. test_user = "Arun" if echo "test_user is a word" else echo "test_user is not a word"

6. Shell Programming and Scripting

How to grep using a line break in regular expressions?

Hi, I have a file as below, {#### if file then file else file } print file i need to fine the count of all the pattern - file, inside the { } i'm using a grep command as grep -c \{'*file*'\} fake.sh\ It doesn't gives me any result, i think the problem here is the...

7. Shell Programming and Scripting

Regular Expressions

#!/usr/bin/perl $word = "one last challenge"; if ( $word =~ /^(\w+).*\s(\w+)$/ ) { print "$1"; print "\n"; print "$2"; } The output shows that "$1" is with result one and "$2" is with result challenge. I am confused about how this pattern match expression works step by step. I...

8. UNIX for Dummies Questions & Answers

Regular expressions

In regular expressions with grep(or egrep), ^ works if we want something in starting of line..but what if we write ^^^ or ^ for pattern matching??..Hope u all r familiar with regular expressions for pattern matching..

9. Shell Programming and Scripting

searching regular expressions with special characters like dot using grep

hi everybody I am a new user to this forum and its previous posts have been very useful. I'm searching in a file using grep for patterns like 12.13.444 55.44.443 i.e. of form <digit><digit>.<digit><digit>.<digit><digit><digit> Can anybody help me with this. Thanks in advance

10. UNIX for Dummies Questions & Answers

grep and regular expressions :

I wrote a simple korn shell where I am trying to filter all the good record layouts of a file to only leave the bad ones to look at. That file is hudge. Aside from '# comments' and 'var=ssss', all record should follow a specific record layout, with comma seperated fields. Some fields can have any...

LEARN ABOUT DEBIAN

search::queryparser

Search::QueryParser(3pm)				User Contributed Perl Documentation				  Search::QueryParser(3pm)

NAME

       Search::QueryParser - parses a query string into a data structure suitable for external search engines

SYNOPSIS

	 my $qp = new Search::QueryParser;
	 my $s = '+mandatoryWord -excludedWord +field:word "exact phrase"';
	 my $query = $qp->parse($s)  or die "Error in query : " . $qp->err;
	 $someIndexer->search($query);

	 # query with comparison operators and implicit plus (second arg is true)
	 $query = $qp->parse("txt~'^foo.*' date>='01.01.2001' date<='02.02.2002'", 1);

	 # boolean operators (example below is equivalent to "+a +(b c) -d")
	 $query = $qp->parse("a AND (b OR c) AND NOT d");

	 # subset of rows
	 $query = $qp->parse("Id#123,444,555,666 AND (b OR c)");

DESCRIPTION

       This module parses a query string into a data structure to be handled by external search engines.  For examples of such engines, see
       File::Tabular and Search::Indexer.

       The query string can contain simple terms, "exact phrases", field names and comparison operators, '+/-' prefixes, parentheses, and boolean
       connectors.

       The parser can be parameterized by regular expressions for specific notions of "term", "field name" or "operator" ; see the new method. The
       parser has no support for lemmatization or other term transformations : these should be done externally, before passing the query data
       structure to the search engine.

       The data structure resulting from a parsed query is a tree of terms and operators, as described below in the parse method.  The
       interpretation of the structure is up to the external search engine that will receive the parsed query ; the present module does not make
       any assumption about what it means to be "equal" or to "contain" a term.

QUERY STRING

       The query string is decomposed into "items", where each item has an optional sign prefix, an optional field name and comparison operator,
       and a mandatory value.

   Sign prefix
       Prefix '+' means that the item is mandatory.  Prefix '-' means that the item must be excluded.  No prefix means that the item will be
       searched for, but is not mandatory.

       As far as the result set is concerned, "+a +b c" is strictly equivalent to "+a +b" : the search engine will return documents containing
       both terms 'a' and 'b', and possibly also term 'c'. However, if the search engine also returns relevance scores, query "+a +b c" might give
       a better score to documents containing also term 'c'.

       See also section "Boolean connectors" below, which is another way to combine items into a query.

   Field name and comparison operator
       Internally, each query item has a field name and comparison operator; if not written explicitly in the query, these take default values ''
       (empty field name) and ':' (colon operator).

       Operators have a left operand (the field name) and a right operand (the value to be compared with); for example, "foo:bar" means "search
       documents containing term 'bar' in field 'foo'", whereas "foo=bar" means "search documents where field 'foo' has exact value 'bar'".

       Here is the list of admitted operators with their intended meaning :

       ":" treat value as a term to be searched within field.  This is the default operator.

       "~" or "=~"
	   treat value as a regex; match field against the regex.

       "!~"
	   negation of above

       "==" or "=", "<=", ">=", "!=", "<", ">"
	   classical relational operators

       "#" Inclusion in the set of comma-separated integers supplied on the right-hand side.

       Operators ":", "~", "=~", "!~" and "#" admit an empty left operand (so the field name will be '').  Search engines will usually interpret
       this as "any field" or "the whole data record".

   Value
       A value (right operand to a comparison operator) can be

       o   just a term (as recognized by regex "rxTerm", see new method below)

       o   A quoted phrase, i.e. a collection of terms within single or double quotes.

	   Quotes can be used not only for "exact phrases", but also to prevent misinterpretation of some values : for example "-2" would mean
	   "value '2' with prefix '-'", in other words "exclude term '2'", so if you want to search for value -2, you should write "-2" instead.
	   In the last example of the synopsis, quotes were used to prevent splitting of dates into several search terms.

       o   a subquery within parentheses.  Field names and operators distribute over parentheses, so for example "foo:(bar bie)" is equivalent to
	   "foo:bar foo:bie".  Nested field names such as "foo:(bar:bie)" are not allowed.  Sign prefixes do not distribute : "+(foo bar) +bie" is
	   not equivalent to "+foo +bar +bie".

   Boolean connectors
       Queries can contain boolean connectors 'AND', 'OR', 'NOT' (or their equivalent in some other languages).  This is mere syntactic sugar for
       the '+' and '-' prefixes : "a AND b" is translated into "+a +b"; "a OR b" is translated into "(a b)"; "NOT a" is translated into "-a".  "+a
       OR b" does not make sense, but it is translated into "(a b)", under the assumption that the user understands "OR" better than a '+' prefix.
       "-a OR b" does not make sense either, but has no meaningful approximation, so it is rejected.

       Combinations of AND/OR clauses must be surrounded by parentheses, i.e. "(a AND b) OR c" or "a AND (b OR c)" are allowed, but "a AND b OR c"
       is not.

METHODS

       new
	     new(rxTerm   => qr/.../, rxOp => qr/.../, ...)

	   Creates a new query parser, initialized with (optional) regular expressions :

	   rxTerm
	       Regular expression for matching a term.	Of course it should not match the empty string.  Default value is "qr/[^s()]+/".  A term
	       should not be allowed to include parenthesis, otherwise the parser might get into trouble.

	   rxField
	       Regular expression for matching a field name.  Default value is "qr/w+/" (meaning of "w" according to "use locale").

	   rxOp
	       Regular expression for matching an operator.  Default value is "qr/==|<=|>=|!=|=~|!~|:|=|<|>|~/".  Note that the longest operators
	       come first in the regex, because "alternatives are tried from left to right" (see "Version 8 Regular Expressions" in perlre) : this
	       is to avoid "a<=3" being parsed as "a < '=3'".

	   rxOpNoField
	       Regular expression for a subset of the operators which admit an empty left operand (no field name).  Default value is
	       "qr/=~|!~|~|:/".  Such operators can be meaningful for comparisons with "any field" or with "the whole record" ; the precise
	       interpretation depends on the search engine.

	   rxAnd
	       Regular expression for boolean connector AND.  Default value is "qr/AND|ET|UND|E/".

	   rxOr
	       Regular expression for boolean connector OR.  Default value is "qr/OR|OU|ODER|O/".

	   rxNot
	       Regular expression for boolean connector NOT.  Default value is "qr/NOT|PAS|NICHT|NON/".

	   defField
	       If no field is specified in the query, use defField.  The default is the empty string "".

       parse
	     $q = $queryParser->parse($queryString, $implicitPlus);

	   Returns a data structure corresponding to the parsed string.  The second argument is optional; if true, it adds an implicit '+' in
	   front of each term without prefix, so "parse("+a b c -d", 1)" is equivalent to "parse("+a +b +c -d")".  This is often seen in common
	   WWW search engines as an option "match all words".

	   The return value has following structure :

	     { '+' => [{field=>'f1', op=>':', value=>'v1', quote=>'q1'},
		       {field=>'f2', op=>':', value=>'v2', quote=>'q2'}, ...],
	       ''  => [...],
	       '-' => [...]
	     }

	   In other words, it is a hash ref with 3 keys '+', '' and '-', corresponding to the 3 sign prefixes (mandatory, ordinary or excluded
	   items). Each key holds either a ref to an array of items, or "undef" (no items with this prefix in the query).

	   An item is a hash ref containing

	   "field"
	       scalar, field name (may be the empty string)

	   "op"
	       scalar, operator

	   "quote"
	       scalar, character that was used for quoting the value ('"', "'" or undef)

	   "value"
	       Either

	       o   a scalar (simple term), or

	       o   a recursive ref to another query structure. In that case, "op" is necessarily '()' ; this corresponds to a subquery in
		   parentheses.

	   In case of a parsing error, "parse" returns "undef"; method err can be called to get an explanatory message.

       err
	     $msg = $queryParser->err;

	   Message describing the last parse error

       unparse
	     $s = $queryParser->unparse($query);

	   Returns a string representation of the $query data structure.

AUTHOR

       Laurent Dami, <laurent.dami AT etat ge ch>

COPYRIGHT AND LICENSE

       Copyright (C) 2005, 2007 by Laurent Dami.

       This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.14.2							    2009-09-30						  Search::QueryParser(3pm)

UNIX for Dummies Questions & Answers