Linux 2.6 - man page for pcrecallout (linux section 3)

PCRECALLOUT(3)									   PCRECALLOUT(3)

NAME
       PCRE - Perl-compatible regular expressions

PCRE CALLOUTS

       int (*pcre_callout)(pcre_callout_block *);

       PCRE provides a feature called "callout", which is a means of temporarily passing control
       to the caller of PCRE in the middle of pattern matching. The caller of PCRE provides an
       external function by putting its entry point in the global variable pcre_callout. By
       default, this variable contains NULL, which disables all calling out.

       Within a regular expression, (?C) indicates the points at which the external  function  is
       to be called. Different callout points can be identified by putting a number less than 256
       after the letter C. The default value is zero.  For example, this pattern has two  callout
       points:

	 (?C1)abc(?C2)def

       If the PCRE_AUTO_CALLOUT option bit is set when pcre_compile() or pcre_compile2() is
       called, PCRE automatically inserts callouts, all with number 255, before each item in the
       pattern. For example, if PCRE_AUTO_CALLOUT is used with the pattern

	 A(\d{2}|--)

       it is processed as if it were

       (?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)

       Notice that there is a callout before and after each parenthesis and alternation bar.
       Automatic callouts can be used for tracking the progress of pattern matching. The pcretest
       command has an option that sets automatic callouts; when it is used, the output indicates
       how the pattern is matched. This is useful information when you are trying to optimize the
       performance of a particular pattern.

MISSING CALLOUTS

       You should be aware that, because of optimizations in the way PCRE matches patterns by
       default, callouts sometimes do not happen. For example, if the pattern is

	 ab(?C4)cd

       PCRE knows that any matching string must contain the letter "d". If the subject string is
       "abyz", the lack of "d" means that matching doesn't ever start, and the callout is never
       reached. However, with "abyd", though the result is still no match, the callout is obeyed.

       If the pattern is studied, PCRE knows the minimum length of a matching string, and will
       immediately give a "no match" return without actually running a match if the subject is
       not long enough, or, for unanchored patterns, if the scan has advanced so far that the
       rest of the subject is too short to match.

       You can disable these optimizations by passing the PCRE_NO_START_OPTIMIZE option to
       pcre_compile(), pcre_exec(), or pcre_dfa_exec(), or by starting the pattern with
       (*NO_START_OPT). This slows down the matching process, but does ensure that callouts such
       as the example above are obeyed.
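
       As a small illustration of the second approach, the sketch below builds a pattern string
       that begins with (*NO_START_OPT). The helper name with_no_start_opt is hypothetical and
       is not part of PCRE; the result would then be handed to pcre_compile() in the usual way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PCRE function): writes "(*NO_START_OPT)"
   followed by pattern into buf. Returns 0 on success, or -1 if buf is
   too small to hold the prefixed pattern and its terminating NUL. */
int with_no_start_opt(const char *pattern, char *buf, size_t buflen)
{
    static const char prefix[] = "(*NO_START_OPT)";
    size_t prefix_len = sizeof prefix - 1;
    size_t pattern_len;

    assert(pattern != NULL && buf != NULL);
    pattern_len = strlen(pattern);

    if (prefix_len + pattern_len + 1 > buflen)
        return -1;

    memcpy(buf, prefix, prefix_len);
    memcpy(buf + prefix_len, pattern, pattern_len + 1); /* copies the NUL too */
    return 0;
}
```

       Passing PCRE_NO_START_OPTIMIZE as an option bit has the same effect and avoids editing
       the pattern text; the prefix form is mainly useful when only the pattern can be changed.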

THE CALLOUT INTERFACE

       During matching, when PCRE reaches a callout point, the external function defined by
       pcre_callout is called (if it is set). This applies to both the pcre_exec() and the
       pcre_dfa_exec() matching functions. The only argument to the callout function is a pointer
       to a pcre_callout block. This structure contains the following fields:

	 int	      version;
	 int	      callout_number;
	 int	     *offset_vector;
	 const char  *subject;
	 int	      subject_length;
	 int	      start_match;
	 int	      current_position;
	 int	      capture_top;
	 int	      capture_last;
	 void	     *callout_data;
	 int	      pattern_position;
	 int	      next_item_length;

       The version field is an integer containing the version number of the block format. The
       initial version was 0; the current version is 1. The version number will change again in
       future if additional fields are added, but the intention is never to remove any of the
       existing fields.

       The callout_number field contains the number of the callout, as compiled into the pattern
       (that is, the number after ?C for manual callouts, and 255 for automatically generated
       callouts).

       The offset_vector field is a pointer to the vector of offsets that was passed by the
       caller to pcre_exec() or pcre_dfa_exec(). When pcre_exec() is used, the contents can be
       inspected in order to extract substrings that have been matched so far, in the same way as
       for extracting substrings after a match has completed. For pcre_dfa_exec() this field is
       not useful.

       The  subject  and  subject_length  fields contain copies of the values that were passed to
       pcre_exec().

       The start_match field normally contains the offset within the subject at which the current
       match attempt started. However, if the escape sequence \K has been encountered, this value
       is changed to reflect the modified starting point. If the pattern is not anchored, the
       callout function may be called several times from the same point in the pattern for
       different starting points in the subject.

       The current_position field contains the offset within the subject  of  the  current  match
       pointer.

       When the pcre_exec() function is used, the capture_top field contains one more than the
       number of the highest numbered captured substring so far. If no substrings have been
       captured, the value of capture_top is one. This is always the case when pcre_dfa_exec() is
       used, because it does not support captured substrings.
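
       The following sketch shows one way a callout function might read a substring that has
       been captured so far. It takes the relevant values as plain arguments so that it compiles
       without pcre.h; in a real callout they would come from the subject, offset_vector, and
       capture_top fields of the block. The helper name copy_captured_so_far is hypothetical.
       It relies on the usual pcre_exec() layout in which group n occupies elements 2n and
       2n+1 of the offset vector, and it skips group 0 on the assumption that the bounds of
       the overall match are not meaningful until matching has finished.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PCRE function): copies the text of capture
   group `group` into buf as a NUL-terminated string.
   Returns the substring length, or -1 if the group has not been captured
   yet, is unset, or does not fit in buf. */
int copy_captured_so_far(const char *subject, const int *offset_vector,
                         int capture_top, int group,
                         char *buf, size_t buflen)
{
    int start, end;
    size_t len;

    assert(subject != NULL && offset_vector != NULL && buf != NULL);

    /* capture_top is one more than the highest captured group number. */
    if (group < 1 || group >= capture_top)
        return -1;

    start = offset_vector[2 * group];
    end = offset_vector[2 * group + 1];
    if (start < 0 || end < start)   /* unset groups hold negative offsets */
        return -1;

    len = (size_t)(end - start);
    if (len + 1 > buflen)
        return -1;

    memcpy(buf, subject + start, len);
    buf[len] = '\0';
    return (int)len;
}
```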

       The capture_last field contains the number of the most recently captured substring. If no
       substrings have been captured, its value is -1. This is always the case when
       pcre_dfa_exec() is used.

       The callout_data field contains a value that is passed to pcre_exec() or pcre_dfa_exec()
       specifically so that it can be passed back in callouts. It is passed in the callout_data
       field of the pcre_extra data structure. If no such data was passed, the value of
       callout_data in a pcre_callout block is NULL. There is a description of the pcre_extra
       structure in the pcreapi documentation.
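
       A common use of callout_data is to carry a pointer to per-match state, so that the
       callout function needs no global variables of its own. The sketch below is one possible
       shape for that; the callout_trace type and the record_callout function are hypothetical
       names, and the values are taken as plain arguments so that the code compiles without
       pcre.h. In a real callout they would be read from the callout_data, callout_number, and
       current_position fields of the block.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_MAX 16

/* Hypothetical per-match state that the caller would supply through the
   pcre_extra structure and receive back in each callout. */
typedef struct {
    int count;
    int numbers[TRACE_MAX];    /* callout numbers, in the order reached */
    int positions[TRACE_MAX];  /* subject offset at each callout */
} callout_trace;

/* Records one callout in the trace, if a trace was supplied and has room.
   Always returns 0, which tells PCRE to carry on matching as normal. */
int record_callout(void *callout_data, int callout_number,
                   int current_position)
{
    callout_trace *trace = callout_data;

    assert(callout_number >= 0 && callout_number <= 255);

    /* callout_data is NULL when the caller passed no data. */
    if (trace != NULL && trace->count < TRACE_MAX) {
        trace->numbers[trace->count] = callout_number;
        trace->positions[trace->count] = current_position;
        trace->count++;
    }
    return 0;
}
```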

       The pattern_position field is present from version 1 of the pcre_callout structure. It
       contains the offset to the next item to be matched in the pattern string.

       The  next_item_length  field  is  present from version 1 of the pcre_callout structure. It
       contains the length of the next item to be matched in the pattern string. When the callout
       immediately precedes an alternation bar, a closing parenthesis, or the end of the pattern,
       the length is zero. When the callout precedes an opening parenthesis, the length  is  that
       of the entire subpattern.

       The pattern_position and next_item_length fields are intended to help in distinguishing
       between different automatic callouts, which all have the same callout number. However,
       they are set for all callouts.
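
       As an illustration, the sketch below recovers the text of the next pattern item from
       these two fields, checking the version field first because both fields exist only from
       version 1 of the block. The helper name copy_next_item is hypothetical, and the caller
       is assumed to have kept its own copy of the pattern string. The values are taken as
       plain arguments so that the code compiles without pcre.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PCRE function): copies the pattern item that
   a callout is about to match into buf as a NUL-terminated string.
   Returns the item length (which may be zero), or -1 if the block version
   predates these fields, the range is invalid, or buf is too small. */
int copy_next_item(int version, const char *pattern,
                   int pattern_position, int next_item_length,
                   char *buf, size_t buflen)
{
    size_t pattern_len;

    assert(pattern != NULL && buf != NULL);

    if (version < 1)   /* fields are absent from version 0 blocks */
        return -1;
    if (pattern_position < 0 || next_item_length < 0)
        return -1;

    pattern_len = strlen(pattern);
    if ((size_t)pattern_position + (size_t)next_item_length > pattern_len)
        return -1;
    if ((size_t)next_item_length + 1 > buflen)
        return -1;

    memcpy(buf, pattern + pattern_position, (size_t)next_item_length);
    buf[next_item_length] = '\0';
    return next_item_length;
}
```

       For the pattern A(\d{2}|--), offset 1 with length 10 would select the whole
       parenthesized subpattern, and offset 2 with length 5 would select \d{2}. These offsets
       were worked out by hand from the pattern text rather than taken from a PCRE run.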

RETURN VALUES

       The external callout function returns an integer to PCRE. If the value is zero, matching
       proceeds as normal. If the value is greater than zero, matching fails at the current
       point, but the testing of other matching possibilities goes ahead, just as if a lookahead
       assertion had failed. If the value is less than zero, the match is abandoned, and
       pcre_exec() or pcre_dfa_exec() returns the negative value.

       Negative values should normally be chosen from the set of PCRE_ERROR_xxx values. In
       particular, PCRE_ERROR_NOMATCH forces a standard "no match" failure. The error number
       PCRE_ERROR_CALLOUT is reserved for use by callout functions; it will never be used by PCRE
       itself.
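
       The three kinds of return value can be summarized in a small decision function. This is
       only a sketch: callout_verdict, max_position, and cancel_requested are hypothetical
       names chosen for the illustration. The fallback definition of PCRE_ERROR_CALLOUT is
       there so that the code compiles without pcre.h; real code should include pcre.h and use
       the definition it provides.

```c
#include <assert.h>

/* Fallback so this sketch builds without pcre.h. The value is assumed to
   mirror the one in pcre.h; prefer the header's definition in real code. */
#ifndef PCRE_ERROR_CALLOUT
#define PCRE_ERROR_CALLOUT (-9)
#endif

/* Hypothetical policy for a callout function:
     negative  abandon the whole match; the matching function returns it
     positive  fail at this point, but let PCRE try other possibilities
     zero      continue matching as normal */
int callout_verdict(int current_position, int max_position,
                    int cancel_requested)
{
    assert(max_position >= 0);

    if (cancel_requested)
        return PCRE_ERROR_CALLOUT;
    if (current_position > max_position)
        return 1;
    return 0;
}
```

       A callout function would return the result of such a check directly, for example to
       reject match paths that run past a chosen offset in the subject while still allowing
       earlier alternatives to be tried.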

AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QH, England.

REVISION

       Last updated: 21 November 2010
       Copyright (c) 1997-2010 University of Cambridge.

										   PCRECALLOUT(3)

