Home Man
Search
Today's Posts
Register

Linux & Unix Commands - Search Man Pages

X11R7.4 - man page for pcrecallout (x11r4 section 3)

PCRECALLOUT(3)			     Library Functions Manual			   PCRECALLOUT(3)

NAME
       PCRE - Perl-compatible regular expressions

PCRE CALLOUTS

       int (*pcre_callout)(pcre_callout_block *);

       PCRE  provides a feature called "callout", which is a means of temporarily passing control
       to the caller of PCRE in the middle of pattern matching. The caller of  PCRE  provides  an
       external  function  by  putting	its  entry  point in the global variable pcre_callout. By
       default, this variable contains NULL, which disables all calling out.

       Within a regular expression, (?C) indicates the points at which the external  function  is
       to be called. Different callout points can be identified by putting a number less than 256
       after the letter C. The default value is zero.  For example, this pattern has two  callout
       points:

	 (?C1)abc(?C2)def

       If  the	PCRE_AUTO_CALLOUT option bit is set when pcre_compile() is called, PCRE automati-
       cally inserts callouts, all with number 255, before each item in the pattern. For example,
       if PCRE_AUTO_CALLOUT is used with the pattern

	 A(\d{2}|--)

       it is processed as if it were

       (?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)

       Notice  that  there  is	a  callout before and after each parenthesis and alternation bar.
       Automatic callouts can be used for tracking the progress of pattern matching. The pcretest
       command	has an option that sets automatic callouts; when it is used, the output indicates
       how the pattern is matched. This is useful information when you are trying to optimize the
       performance of a particular pattern.

MISSING CALLOUTS

       You  should  be	aware  that, because of optimizations in the way PCRE matches patterns by
       default, callouts sometimes do not happen. For example, if the pattern is

	 ab(?C4)cd

       PCRE knows that any matching string must contain the letter "d". If the subject string  is
       "abyz",	the  lack of "d" means that matching doesn't ever start, and the callout is never
       reached. However, with "abyd", though the result is still no match, the callout is obeyed.

       You can disable these  optimizations  by  passing  the  PCRE_NO_START_OPTIMIZE  option  to
       pcre_exec() or pcre_dfa_exec(). This slows down the matching process, but does ensure that
       callouts such as the example above are obeyed.

THE CALLOUT INTERFACE

       During matching, when PCRE reaches a callout  point,  the  external  function  defined  by
       pcre_callout  is  called  (if  it  is  set).  This applies to both the pcre_exec() and the
       pcre_dfa_exec() matching functions. The only argument to the callout function is a pointer
       to a pcre_callout block. This structure contains the following fields:

	 int	      version;
	 int	      callout_number;
	 int	     *offset_vector;
	 const char  *subject;
	 int	      subject_length;
	 int	      start_match;
	 int	      current_position;
	 int	      capture_top;
	 int	      capture_last;
	 void	     *callout_data;
	 int	      pattern_position;
	 int	      next_item_length;

       The  version  field  is	an integer containing the version number of the block format. The
       initial version was 0; the current version is 1. The version number will change	again  in
       future  if  additional  fields  are added, but the intention is never to remove any of the
       existing fields.

       The callout_number field contains the number of the callout, as compiled into the  pattern
       (that  is,  the	number	after ?C for manual callouts, and 255 for automatically generated
       callouts).

       The offset_vector field is a pointer to the vector of offsets that was passed by the call-
       er  to  pcre_exec()  or	pcre_dfa_exec().  When	pcre_exec()  is used, the contents can be
       inspected in order to extract substrings that have been matched so far, in the same way as
       for  extracting	substrings after a match has completed. For pcre_dfa_exec() this field is
       not useful.

       The subject and subject_length fields contain copies of the values  that  were  passed  to
       pcre_exec().

       The start_match field normally contains the offset within the subject at which the current
       match attempt started. However, if the escape sequence \K has been encountered, this value
       is  changed  to	reflect  the modified starting point. If the pattern is not anchored, the
       callout function may be called several times from the same point in the pattern	for  dif-
       ferent starting points in the subject.

       The  current_position  field  contains  the offset within the subject of the current match
       pointer.

       When the pcre_exec() function is used, the capture_top field contains one  more	than  the
       number  of the highest numbered captured substring so far. If no substrings have been cap-
       tured, the value of capture_top is one. This is always the case	when  pcre_dfa_exec()  is
       used, because it does not support captured substrings.

       The  capture_last field contains the number of the most recently captured substring. If no
       substrings  have  been  captured,  its  value  is  -1.  This  is  always  the  case   when
       pcre_dfa_exec() is used.

       The  callout_data  field contains a value that is passed to pcre_exec() or pcre_dfa_exec()
       specifically so that it can be passed back in callouts. It is passed in	the  pcre_callout
       field  of  the  pcre_extra  data structure. If no such data was passed, the value of call-
       out_data in a pcre_callout block is NULL. There is a description of the pcre_extra  struc-
       ture in the pcreapi documentation.

       The  pattern_position  field  is  present from version 1 of the pcre_callout structure. It
       contains the offset to the next item to be matched in the pattern string.

       The next_item_length field is present from version 1 of	the  pcre_callout  structure.  It
       contains the length of the next item to be matched in the pattern string. When the callout
       immediately precedes an alternation bar, a closing parenthesis, or the end of the pattern,
       the  length  is zero. When the callout precedes an opening parenthesis, the length is that
       of the entire subpattern.

       The pattern_position and next_item_length fields are intended to  help  in  distinguishing
       between	different  automatic  callouts,  which all have the same callout number. However,
       they are set for all callouts.

RETURN VALUES

       The external callout function returns an integer to PCRE. If the value is  zero,  matching
       proceeds  as  normal.  If  the  value  is greater than zero, matching fails at the current
       point, but the testing of other matching possibilities goes ahead, just as if a	lookahead
       assertion  had  failed.	If  the  value	is  less  than	zero, the match is abandoned, and
       pcre_exec() (or pcre_dfa_exec()) returns the negative value.

       Negative values should normally be chosen from the set of PCRE_ERROR_xxx values.  In  par-
       ticular,  PCRE_ERROR_NOMATCH  forces  a	standard  "no  match"  failure.  The error number
       PCRE_ERROR_CALLOUT is reserved for use by callout functions; it will never be used by PCRE
       itself.

AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QH, England.

REVISION

       Last updated: 15 March 2009
       Copyright (c) 1997-2009 University of Cambridge.

										   PCRECALLOUT(3)


All times are GMT -4. The time now is 03:30 PM.

Unix & Linux Forums Content Copyrightę1993-2018. All Rights Reserved.
UNIX.COM Login
Username:
Password:  
Show Password