Unix/Linux Go Back    


X11R7.4 - man page for pcregrep (x11r4 section 1)

Linux & Unix Commands - Search Man Pages
Man Page or Keyword Search:   man
Select Man Page Set:       apropos Keyword Search (sections above)


PCREGREP(1)									      PCREGREP(1)

NAME
       pcregrep - a grep with Perl-compatible regular expressions.

SYNOPSIS
       pcregrep [options] [long options] [pattern] [path1 path2 ...]

DESCRIPTION

       pcregrep searches files for character patterns, in the same way as other grep commands do,
       but it uses the PCRE regular expression library to support patterns  that  are  compatible
       with  the regular expressions of Perl 5. See pcrepattern(3) for a full description of syn-
       tax and semantics of the regular expressions that PCRE supports.

       Patterns, whether supplied on the command line or in a separate file,  are  given  without
       delimiters. For example:

	 pcregrep Thursday /etc/motd

       If  you	attempt to use delimiters (for example, by surrounding a pattern with slashes, as
       is common in Perl scripts), they are interpreted as part of the	pattern.  Quotes  can  of
       course be used to delimit patterns on the command line because they are interpreted by the
       shell, and indeed they are required if a pattern contains white space or shell metacharac-
       ters.

       The first argument that follows any option settings is treated as the single pattern to be
       matched when neither -e nor -f is present.  Conversely, when one or both of these  options
       are used to specify patterns, all arguments are treated as path names. At least one of -e,
       -f, or an argument pattern must be provided.

       If no files are specified, pcregrep reads the standard input. The standard input can  also
       be referenced by a name consisting of a single hyphen.  For example:

	 pcregrep some-pattern /file1 - /file3

       By  default,  each  line  that  matches a pattern is copied to the standard output, and if
       there is more than one file, the file name is output at the start of each  line,  followed
       by  a  colon. However, there are options that can change how pcregrep behaves. In particu-
       lar, the -M option makes it possible to search for patterns  that  span	line  boundaries.
       What defines a line boundary is controlled by the -N (--newline) option.

       Patterns  are  limited  to  8K  or BUFSIZ characters, whichever is the greater.	BUFSIZ is
       defined in <stdio.h>. When there is more than one pattern (specified  by  the  use  of  -e
       and/or  -f),  each pattern is applied to each line in the order in which they are defined,
       except that all the -e patterns are tried before the -f patterns.

       By default, as soon as one pattern matches (or fails to match when -v is used), no further
       patterns  are considered. However, if --colour (or --color) is used to colour the matching
       substrings, or if --only-matching, --file-offsets, or --line-offsets  is  used  to  output
       only the part of the line that matched (either shown literally, or as an offset), scanning
       resumes immediately following the match, so that further matches on the same line  can  be
       found.  If  there  are multiple patterns, they are all tried on the remainder of the line,
       but patterns that follow the one that matched are not tried on the  earlier  part  of  the
       line.

       This  is the same behaviour as GNU grep, but it does mean that the order in which multiple
       patterns are specified can affect the output when one of the above options is used.

       Patterns that can match an empty string are accepted, but empty	string	matches  are  not
       recognized.  An	example  is  the  pattern  "(super)?(man)?",  in which all components are
       optional. This pattern finds all occurrences of both "super" and "man"; the output differs
       from matching with "super|man" when only the matching substrings are being shown.

       If  the	LC_ALL	or LC_CTYPE environment variable is set, pcregrep uses the value to set a
       locale when calling the PCRE library.  The --locale option can be used to override this.

SUPPORT FOR COMPRESSED FILES

       It is possible to compile pcregrep so that it uses libz or  libbz2  to  read  files  whose
       names  end  in .gz or .bz2, respectively. You can find out whether your binary has support
       for one or both of these file types by running it with the --help option. If the appropri-
       ate  support is not present, files are treated as plain text. The standard input is always
       so treated.

OPTIONS

       --	 This terminate the list of options. It is useful if the next item on the command
		 line  starts  with a hyphen but is not an option. This allows for the processing
		 of patterns and filenames that start with hyphens.

       -A number, --after-context=number
		 Output number lines of context after each matching  line.  If	filenames  and/or
		 line numbers are being output, a hyphen separator is used instead of a colon for
		 the context lines. A line containing "--" is output between each group of lines,
		 unless  they  are  in	fact contiguous in the input file. The value of number is
		 expected to be relatively small. However, pcregrep guarantees to have up  to  8K
		 of following text available for context output.

       -B number, --before-context=number
		 Output  number  lines	of context before each matching line. If filenames and/or
		 line numbers are being output, a hyphen separator is used instead of a colon for
		 the context lines. A line containing "--" is output between each group of lines,
		 unless they are in fact contiguous in the input file. The  value  of  number  is
		 expected  to  be relatively small. However, pcregrep guarantees to have up to 8K
		 of preceding text available for context output.

       -C number, --context=number
		 Output number lines of context both before and after each matching  line.   This
		 is equivalent to setting both -A and -B to the same value.

       -c, --count
		 Do  not  output  individual  lines; instead just output a count of the number of
		 lines that would otherwise have been output. If several files are given, a count
		 is  output  for  each	of  them.  In  this  mode, the -A, -B, and -C options are
		 ignored.

       --colour, --color
		 If this option is given without any data, it is equivalent  to  "--colour=auto".
		 If  data  is  required, it must be given in the same shell item, separated by an
		 equals sign.

       --colour=value, --color=value
		 This option specifies under what circumstances the parts of a line that  matched
		 a  pattern  should  be  coloured  in  the  output. By default, the output is not
		 coloured. The value (which is optional, see above) may be "never", "always",  or
		 "auto".  In  the  latter  case, colouring happens only if the standard output is
		 connected to a terminal. More resources are  used  when  colouring  is  enabled,
		 because pcregrep has to search for all possible matches in a line, not just one,
		 in order to colour them all.

		 The colour that is used can be specified by  setting  the  environment  variable
		 PCREGREP_COLOUR or PCREGREP_COLOR. The value of this variable should be a string
		 of two numbers, separated by a semicolon. They are copied directly into the con-
		 trol  string  for  setting colour on a terminal, so it is your responsibility to
		 ensure that they make sense. If neither of the environment variables is set, the
		 default is "1;31", which gives red.

       -D action, --devices=action
		 If an input path is not a regular file or a directory, "action" specifies how it
		 is to be processed. Valid values are "read" (the default)  or	"skip"	(silently
		 skip the path).

       -d action, --directories=action
		 If  an  input path is a directory, "action" specifies how it is to be processed.
		 Valid values are "read" (the default), "recurse" (equivalent to the -r  option),
		 or "skip" (silently skip the path). In the default case, directories are read as
		 if they were ordinary files. In some operating systems the effect of  reading	a
		 directory like this is an immediate end-of-file.

       -e pattern, --regex=pattern, --regexp=pattern
		 Specify a pattern to be matched. This option can be used multiple times in order
		 to specify several patterns. It can also be used as a way of specifying a single
		 pattern that starts with a hyphen. When -e is used, no argument pattern is taken
		 from the command line; all arguments are treated as  file  names.  There  is  an
		 overall  maximum  of 100 patterns. They are applied to each line in the order in
		 which they are defined until one matches (or fails to match if -v is  used).  If
		 -f is used with -e, the command line patterns are matched first, followed by the
		 patterns from the file, independent of the order  in  which  these  options  are
		 specified. Note that multiple use of -e is not the same as a single pattern with
		 alternatives. For example, X|Y finds the first character in a line that is X  or
		 Y,  whereas  if the two patterns are given separately, pcregrep finds X if it is
		 present, even if it follows Y in the line. It finds Y only if there is no  X  in
		 the  line.  This  really matters only if you are using -o to show the part(s) of
		 the line that matched.

       --exclude=pattern
		 When pcregrep is searching the files in a directory as a consequence of  the  -r
		 (recursive  search)  option, any regular files whose names match the pattern are
		 excluded. Subdirectories are not excluded by  this  option;  they  are  searched
		 recursively, subject to the --exclude_dir and --include_dir options. The pattern
		 is a PCRE regular expression, and is matched against the final component of  the
		 file  name  (not  the	entire	path).	If a file name matches both --include and
		 --exclude, it is excluded.  There is no short form for this option.

       --exclude_dir=pattern
		 When pcregrep is searching the contents of a directory as a consequence  of  the
		 -r  (recursive  search) option, any subdirectories whose names match the pattern
		 are excluded. (Note that the --exclude option does not  affect  subdirectories.)
		 The  pattern is a PCRE regular expression, and is matched against the final com-
		 ponent of the name (not the entire path). If a subdirectory  name  matches  both
		 --include_dir and --exclude_dir, it is excluded. There is no short form for this
		 option.

       -F, --fixed-strings
		 Interpret each pattern as a  list  of	fixed  strings,  separated  by	newlines,
		 instead of as a regular expression. The -w (match as a word) and -x (match whole
		 line) options can be used with -F. They apply to each of the  fixed  strings.	A
		 line  is  selected if any of the fixed strings are found in it (subject to -w or
		 -x, if present).

       -f filename, --file=filename
		 Read a number of patterns from the file, one per line, and  match  them  against
		 each  line  of input. A data line is output if any of the patterns match it. The
		 filename can be given as "-" to refer to the standard input. When  -f	is  used,
		 patterns  specified  on  the command line using -e may also be present; they are
		 tested before the file's patterns. However, no other pattern is taken	from  the
		 command line; all arguments are treated as file names. There is an overall maxi-
		 mum of 100 patterns. Trailing white space is removed from each line,  and  blank
		 lines	are  ignored.  An  empty  file contains no patterns and therefore matches
		 nothing. See also the comments about multiple patterns versus a  single  pattern
		 with alternatives in the description of -e above.

       --file-offsets
		 Instead  of  showing  lines  or parts of lines that match, show each match as an
		 offset from the start of the file and a length, separated by a  comma.  In  this
		 mode,	no  context is shown. That is, the -A, -B, and -C options are ignored. If
		 there is more than one match in a line, each of them is shown	separately.  This
		 option is mutually exclusive with --line-offsets and --only-matching.

       -H, --with-filename
		 Force	the inclusion of the filename at the start of output lines when searching
		 a single file. By default, the filename is not shown in this case. For  matching
		 lines,  the filename is followed by a colon; for context lines, a hyphen separa-
		 tor is used. If a line number is also being output, it follows the file name.

       -h, --no-filename
		 Suppress the output filenames when searching multiple files. By  default,  file-
		 names	are shown when multiple files are searched. For matching lines, the file-
		 name is followed by a colon; for context lines, a hyphen separator is used.   If
		 a line number is also being output, it follows the file name.

       --help	 Output a help message, giving brief details of the command options and file type
		 support, and then exit.

       -i, --ignore-case
		 Ignore upper/lower case distinctions during comparisons.

       --include=pattern
		 When pcregrep is searching the files in a directory as a consequence of  the  -r
		 (recursive  search)  option, only those regular files whose names match the pat-
		 tern are included. Subdirectories are always included and searched  recursively,
		 subject  to  the  --include_dir and --exclude_dir options. The pattern is a PCRE
		 regular expression, and is matched against the final component of the file  name
		 (not  the  entire path). If a file name matches both --include and --exclude, it
		 is excluded. There is no short form for this option.

       --include_dir=pattern
		 When pcregrep is searching the contents of a directory as a consequence  of  the
		 -r  (recursive  search)  option, only those subdirectories whose names match the
		 pattern are included. (Note that the --include option does not affect	subdirec-
		 tories.)  The	pattern  is a PCRE regular expression, and is matched against the
		 final component of the name (not  the	entire	path).	If  a  subdirectory  name
		 matches  both --include_dir and --exclude_dir, it is excluded. There is no short
		 form for this option.

       -L, --files-without-match
		 Instead of outputting lines from the files, just output the names of  the  files
		 that  do  not	contain  any lines that would have been output. Each file name is
		 output once, on a separate line.

       -l, --files-with-matches
		 Instead of outputting lines from the files, just output the names of  the  files
		 containing  lines that would have been output. Each file name is output once, on
		 a separate line. Searching stops as soon as a matching line is found in a file.

       --label=name
		 This option supplies a name to be used for the standard input	when  file  names
		 are being output. If not supplied, "(standard input)" is used. There is no short
		 form for this option.

       --line-offsets
		 Instead of showing lines or parts of lines that match, show each match as a line
		 number,  the offset from the start of the line, and a length. The line number is
		 terminated by a colon (as usual; see the -n option), and the offset  and  length
		 are  separated  by a comma. In this mode, no context is shown.  That is, the -A,
		 -B, and -C options are ignored. If there is more than one match in a line,  each
		 of  them is shown separately. This option is mutually exclusive with --file-off-
		 sets and --only-matching.

       --locale=locale-name
		 This option specifies a locale to be used for pattern matching. It overrides the
		 value	in  the  LC_ALL or LC_CTYPE environment variables. If no locale is speci-
		 fied, the PCRE library's default (usually the "C" locale) is used. There  is  no
		 short form for this option.

       -M, --multiline
		 Allow	patterns to match more than one line. When this option is given, patterns
		 may usefully contain literal newline characters and internal  occurrences  of	^
		 and  $  characters.  The  output  for any one match may consist of more than one
		 line. When this option is set, the PCRE library is called in  "multiline"  mode.
		 There	is a limit to the number of lines that can be matched, imposed by the way
		 that pcregrep buffers the input file as it scans it. However,	pcregrep  ensures
		 that  at  least  8K  characters  or  the  rest of the document (whichever is the
		 shorter) are available for forward matching, and similarly the previous 8K char-
		 acters  (or  all the previous characters, if fewer than 8K) are guaranteed to be
		 available for lookbehind assertions.

       -N newline-type, --newline=newline-type
		 The PCRE library supports five different conventions for indicating the ends  of
		 lines.  They  are  the  single-character  sequences  CR (carriage return) and LF
		 (linefeed), the two-character sequence CRLF, an "anycrlf" convention, which rec-
		 ognizes  any of the preceding three types, and an "any" convention, in which any
		 Unicode line ending sequence is assumed to end a line. The Unicode sequences are
		 the three just mentioned, plus VT (vertical tab, U+000B), FF (formfeed, U+000C),
		 NEL (next line, U+0085), LS (line separator, U+2028), and PS (paragraph  separa-
		 tor, U+2029).

		 When  the  PCRE  library  is built, a default line-ending sequence is specified.
		 This is normally the standard sequence for the operating system.  Unless  other-
		 wise  specified by this option, pcregrep uses the library's default.  The possi-
		 ble values for this option are CR, LF, CRLF, ANYCRLF, or ANY. This makes it pos-
		 sible	to  use  pcregrep on files that have come from other environments without
		 having to modify their line endings. If the data that is being scanned does  not
		 agree	with  the  convention  set by this option, pcregrep may behave in strange
		 ways.

       -n, --line-number
		 Precede each output line by its line number in the file, followed by a colon for
		 matching lines or a hyphen for context lines. If the filename is also being out-
		 put, it precedes the line number. This option is  forced  if  --line-offsets  is
		 used.

       -o, --only-matching
		 Show  only the part of the line that matched a pattern. In this mode, no context
		 is shown. That is, the -A, -B, and -C options are ignored. If there is more than
		 one match in a line, each of them is shown separately. If -o is combined with -v
		 (invert the sense of the match to find non-matching lines), no output is  gener-
		 ated,	but  the return code is set appropriately. This option is mutually exclu-
		 sive with --file-offsets and --line-offsets.

       -q, --quiet
		 Work quietly, that is, display nothing except error messages.	The  exit  status
		 indicates whether or not any matches were found.

       -r, --recursive
		 If any given path is a directory, recursively scan the files it contains, taking
		 note of any --include and --exclude settings. By default, a directory is read as
		 a  normal  file;  in some operating systems this gives an immediate end-of-file.
		 This option is a shorthand for setting the -d option to "recurse".

       -s, --no-messages
		 Suppress error messages about non-existent or unreadable files. Such  files  are
		 quietly skipped. However, the return code is still 2, even if matches were found
		 in other files.

       -u, --utf-8
		 Operate in UTF-8 mode. This option is available only if PCRE has  been  compiled
		 with  UTF-8  support.	Both  patterns and subject lines must be valid strings of
		 UTF-8 characters.

       -V, --version
		 Write the version numbers of pcregrep and the PCRE library that is being used to
		 the standard error stream.

       -v, --invert-match
		 Invert  the sense of the match, so that lines which do not match any of the pat-
		 terns are the ones that are found.

       -w, --word-regex, --word-regexp
		 Force the patterns to match only whole words. This is equivalent to having \b at
		 the start and end of the pattern.

       -x, --line-regex, --line-regexp
		 Force	the patterns to be anchored (each must start matching at the beginning of
		 a line) and in addition, require them to match entire lines. This is  equivalent
		 to  having ^ and $ characters at the start and end of each alternative branch in
		 every pattern.

ENVIRONMENT VARIABLES

       The environment variables LC_ALL and LC_CTYPE are examined, in that order, for  a  locale.
       The  first  one	that is set is used. This can be overridden by the --locale option. If no
       locale is set, the PCRE library's default (usually the "C" locale) is used.

NEWLINES

       The -N (--newline) option allows pcregrep to scan files with different newline conventions
       from  the  default.  However,  the setting of this option does not affect the way in which
       pcregrep writes information to the standard error and output streams. It uses  the  string
       "\n"  in  C  printf()  calls to indicate newlines, relying on the C I/O library to convert
       this to an appropriate sequence if the output is sent to a file.

OPTIONS COMPATIBILITY

       The majority of short and long forms of pcregrep's options are the same as in the GNU grep
       program.  Any  long option of the form --xxx-regexp (GNU terminology) is also available as
       --xxx-regex (PCRE terminology). However, the --locale, -M, --multiline,	-u,  and  --utf-8
       options are specific to pcregrep.

OPTIONS WITH DATA

       There  are  four different ways in which an option with data can be specified.  If a short
       form option is used, the data may follow immediately, or in the next  command  line  item.
       For example:

	 -f/some/file
	 -f /some/file

       If  a  long  form option is used, the data may appear in the same command line item, sepa-
       rated by an equals character, or (with one exception) it may appear in  the  next  command
       line item. For example:

	 --file=/some/file
	 --file /some/file

       Note,  however, that if you want to supply a file name beginning with ~ as data in a shell
       command, and have the shell expand ~ to a home directory, you must separate the file  name
       from the option, because the shell does not treat ~ specially unless it is at the start of
       an item.

       The exception to the above is the --colour (or --color) option,	for  which  the  data  is
       optional.  If  this  option  does  have data, it must be given in the first form, using an
       equals character. Otherwise it will be assumed that it has no data.

MATCHING ERRORS

       It is possible to supply a regular expression that takes a very long time to fail to match
       certain	lines.	Such  patterns	normally  involve nested indefinite repeats, for example:
       (a+)*\d when matched against a line of a's with no final digit. The PCRE matching function
       has  a  resource  limit	that  causes it to abort in these circumstances. If this happens,
       pcregrep outputs an error message and the line that caused the  problem	to  the  standard
       error stream. If there are more than 20 such errors, pcregrep gives up.

DIAGNOSTICS

       Exit  status  is 0 if any matches were found, 1 if no matches were found, and 2 for syntax
       errors and non-existent or inacessible files (even if matches were found in  other  files)
       or  too	many  matching errors. Using the -s option to suppress error messages about inac-
       cessble files does not affect the return code.

SEE ALSO

       pcrepattern(3), pcretest(1).

AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QH, England.

REVISION

       Last updated: 01 March 2009
       Copyright (c) 1997-2009 University of Cambridge.

										      PCREGREP(1)
Unix & Linux Commands & Man Pages : ©2000 - 2018 Unix and Linux Forums


All times are GMT -4. The time now is 06:20 AM.