CentOS 7.0 - man page for pmie (centos section 1)

PMIE(1) 										  PMIE(1)

NAME
       pmie - inference engine for performance metrics

SYNOPSIS
       pmie  [-bCdefHVvWxz]  [-A  align]  [-a  archive]  [-c filename] [-h host] [-l logfile] [-j
       stompfile] [-n pmnsfile] [-O offset] [-S starttime] [-T endtime] [-t interval]  [-U  user-
       name] [-Z timezone] [filename ...]

DESCRIPTION
       pmie  accepts a collection of arithmetic, logical, and rule expressions to be evaluated at
       specified frequencies.  The base data for the expressions consists of performance  metrics
       values  delivered  in  real-time  from any host running the Performance Metrics Collection
       Daemon (PMCD), or using historical data from Performance Co-Pilot (PCP) archive logs.

       As well as computing arithmetic and logical values, pmie can execute actions (raise
       popup alarms, write system log messages, and launch programs) in response to specified
       conditions.  Such actions are useful in detecting, monitoring and correcting
       performance-related problems.

       The expressions to be evaluated are read from configuration files specified by one or more
       filename arguments.  In the absence of any filename, expressions are  read  from  standard
       input.

       A description of the command line options specific to pmie follows:

       -a   archive  is  the  base  name  of  a PCP archive log written by pmlogger(1).  Multiple
	    instances of the -a flag may appear on the command line to specify a set of archives.
	    In	this  case,  it  is  required  that only one archive be present for any one host.
	    Also, any explicit host names occurring in a pmie expression must match the host name
	    recorded  in one of the archive labels.  In the case of multiple archives, timestamps
	    recorded in the archives are used to ensure temporal consistency.

       -b   Output will be line buffered and standard output is attached to standard error.  This
	    is	most  useful  for background execution in conjunction with the -l option.  The -b
	    option is always used for pmie instances launched from pmie_check(1).

       -C   Parse the configuration file(s) and exit  before  performing  any  evaluations.   Any
	    errors in the configuration file are reported.

       -c   An alternative to specifying filename at the end of the command line.

       -d   Normally  pmie  would  be launched as a non-interactive process to monitor and manage
	    the performance of one or more hosts.  Given the -d flag however, execution is inter-
	    active  and the user is presented with a menu of options.  Interactive mode is useful
	    mainly for debugging new expressions.

       -e   When used with -V, -v or -W, this option forces timestamps to be reported  with  each
	    expression.  The timestamps are in ctime(3) format, enclosed in parentheses, and
	    appear after the expression name and before the expression value, e.g.
		 expr_1 (Tue Feb  6 19:55:10 2001): 12
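
       Output in this form can be passed to other tools.  As a minimal sketch, a Python filter
       could split such a line into name, timestamp and value.  The function name
       parse_pmie_line and the regular expression are illustrative assumptions and are not part
       of pmie; the sketch assumes the single-value line format shown above and the C locale.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the example above: name (ctime stamp): value
LINE_RE = re.compile(r"^(?P<name>\S+) \((?P<stamp>[^)]+)\): (?P<value>\S+)$")


def parse_pmie_line(line):
    """Return (name, timestamp, value) for one timestamped line, or None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # ctime(3) layout, e.g. "Tue Feb  6 19:55:10 2001"
    stamp = datetime.strptime(m.group("stamp"), "%a %b %d %H:%M:%S %Y")
    try:
        value = float(m.group("value"))
    except ValueError:
        value = m.group("value")  # keep non-numeric values as text
    return m.group("name"), stamp, value


result = parse_pmie_line("expr_1 (Tue Feb  6 19:55:10 2001): 12")
```

       Lines that do not fit the assumed shape yield None, so the filter can skip them.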

       -f   If the -l option is specified and there is no -a option (i.e. real-time monitoring)
	    then  pmie is run as a daemon in the background (in all other cases foreground is the
	    default).  The -f option forces pmie to be run in the foreground, independent of  any
	    other options.

       -H   The  default  hostname written to the stats file will not be looked up via gethostby-
	    name(3), rather it will be written as-is.  This option can be useful when  host  name
	    aliases  are in use at a site, and the logical name is more important than the physi-
	    cal host name.

       -h   By default performance data is fetched from the local host (in real-time mode) or the
	    host  for  the  first  named archive on the command line (in archive mode).  The host
	    argument overrides this default.  It does not override hosts explicitly named in  the
	    expressions being evaluated.

       -l   Standard error is sent to logfile.

       -j   An alternative STOMP protocol configuration is loaded from stompfile.  If this option
	    is not used, and the  stomp  action  is  used  in  any  rule,  the	default  location
	    $PCP_SYSCONF_DIR/pmie/config/stomp will be used.

       -n   An	alternative  Performance  Metrics Name Space (PMNS) is loaded from the file pmns-
	    file.

       -t   The interval argument follows the syntax described in PCPIntro(1), and  in	the  sim-
	    plest  form  may be an unsigned integer (the implied units in this case are seconds).
	    The value is used to determine the	sample	interval  for  expressions  that  do  not
	    explicitly	set  their sample interval using the pmie variable delta described below.
	    The default is 10.0 seconds.

       -U username
	    User account under which to run pmie.  The default is the current  user  account  for
	    interactive  use.	When  run  as a daemon, the unprivileged "pcp" account is used in
	    current versions of PCP, but in older versions the	superuser  account  ("root")  was
	    used by default.

       -v   Unless one of the verbose options -V, -v or -W appears on the command line,
	    expressions are evaluated silently; the only output is the result of any actions
	    being executed.  In the verbose mode, specified using the -v flag, the value of each
	    expression is printed as it is evaluated.  The values are in canonical  units;  bytes
	    in the dimension of ``space'', seconds in the dimension of ``time'' and events in the
	    dimension of ``count''.  See pmLookupDesc(3) for details of the  supported	dimension
	    and  scaling mechanisms for performance metrics.  The verbose mode is useful in moni-
	    toring the value of given expressions, evaluating derived performance metrics,  pass-
	    ing  these	values	on  to	other  tools  for further processing and in debugging new
	    expressions.

       -V   This option has the same effect as the -v option, except that the name  of	the  host
	    and instance (if applicable) are printed as well as expression values.

       -W   This  option  has  the  same effect as the -V option described above, except that for
	    boolean expressions, only those names and values that make the  expression	true  are
	    printed.   These  are the same names and values accessible to rule actions as the %h,
	    %i and %v bindings, as described below.

       -x   Execute in domain agent mode.  This mode is  used  within  the  Performance  Co-Pilot
	    product  to  derive  values for summary metrics, see pmdasummary(1).  Only restricted
	    functionality is available in this mode (expressions with actions may not be used).

       -Z   Change the reporting timezone to timezone in the format of the  environment  variable
	    TZ as described in environ(5).

       -z   Change  the  reporting timezone to the timezone of the host that is the source of the
	    performance metrics, as identified via either the -h option or the	first  named  ar-
	    chive (as described above for the -a option).

       The -S, -T, -O, and -A options may be used to define a time window to restrict the samples
       retrieved, set an initial origin within the time window, or specify a  ``natural''  align-
       ment  of  the  sample  times;  refer  to  PCPIntro(1)  for a complete description of these
       options.

       Output from pmie is directed to standard output and standard error as follows:

       stdout
	    Expression values printed in the verbose -v mode and the output of print actions.

       stderr
	    Error and warning messages for any syntactic or semantic problems  during  expression
	    parsing, and any semantic or performance metrics availability problems during expres-
	    sion evaluation.

EXAMPLES
       The following example expressions demonstrate some of the capabilities  of  the	inference
       engine.

       The  directory  $PCP_DEMOS_DIR/pmie  contains a number of other annotated examples of pmie
       expressions.

       The variable delta controls expression  evaluation  frequency.	Specify  that  subsequent
       expressions be evaluated once a second, until further notice:

	    delta = 1 sec;

       If total syscall rate exceeds 5000 per second per CPU, then display an alarm notifier:

	    kernel.all.syscall / hinv.ncpu > 5000 count/sec
	    -> alarm "high syscall rate";

       If the high syscall rate is sustained for 10 consecutive samples, then launch top(1) in an
       xwsh(1G) window to monitor processes, but do this at most once every 5 minutes:

	    all_sample (
		kernel.all.syscall @0..9 > 5000 count/sec * hinv.ncpu
	    ) -> shell 5 min "xwsh -e 'top'";

       The following rules are evaluated once every 20 seconds:

	    delta = 20 sec;

       If any disk is performing more than 60 I/Os per second, then print a  message  identifying
       the busy disk to standard output and launch dkvis(1):

	    some_inst (
		disk.dev.total > 60 count/sec
	    ) -> print "disk %i busy " &
		 shell 5 min "dkvis";

       Refine the preceding rule to apply only between the hours of 9am and 5pm, and to require 3
       of 4 consecutive samples to exceed the threshold before executing the action:

	    $hour >= 9 && $hour < 17 &&
	    some_inst (
	      75 %_sample (
		disk.dev.total @0..3 > 60 count/sec
	      )
	    ) -> print "disk %i busy ";

       The following rules are evaluated once every 10 minutes:

	    delta = 10 min;

       If either the / or the /usr filesystem is more than 95% full, display an alarm popup,  but
       not if it has already been displayed during the last 4 hours:

	    filesys.free #'/dev/root' /
		filesys.capacity #'/dev/root' < 0.05
	    -> alarm 4 hour "root filesystem (almost) full";

	    filesys.free #'/dev/usr' /
		filesys.capacity #'/dev/usr' < 0.05
	    -> alarm 4 hour "/usr filesystem (almost) full";

       The  following  rule requires a machine that supports the PCP environment metrics.  If the
       machine environment temperature rises more than 2 degrees over a 10 minute interval, write
       an entry in the system log:

	    environ.temp @0 - environ.temp @1 > 2
	    -> alarm "temperature rising fast" &
	       syslog "machine room temperature rise alarm";

       And  last,  something  interesting if you have performance problems with your Oracle data-
       base:

	    db = "oracle.ptg1";
	    host = ":moomba.melbourne.sgi.com";
	    lru = "#'cache buffers lru chain'";
	    gets = "$db.latch.gets $host $lru";
	    total = "$db.latch.gets $host $lru +
		     $db.latch.misses $host $lru +
		     $db.latch.immisses $host $lru";

	    $total > 100 && $gets / $total < 0.2
	    -> alarm "high lru latch contention";

QUICK START
       The pmie specification language is powerful and large.

       To expedite rapid development of pmie rules, the pmieconf(1) tool provides a facility  for
       generating  a  pmie configuration file from a set of generalized pmie rules.  The supplied
       set of rules covers a wide range of performance scenarios.

       The Performance Co-Pilot User's and Administrator's Guide provides  a  detailed	tutorial-
       style chapter covering pmie.

EXPRESSION SYNTAX
       This description is terse and informal.	For a more comprehensive description see the Per-
       formance Co-Pilot User's and Administrator's Guide.

       A pmie specification is a sequence of semicolon terminated expressions.

       Basic operators are modeled on the arithmetic, relational and Boolean operators of  the	C
       programming  language.	Precedence rules are as expected, although the use of parentheses
       is encouraged to enhance readability and remove ambiguity.

       Operands are performance metric names (see pmns(5)) and the normal literal constants.

       Operands involving performance metrics may produce sets of values, as a result of enumera-
       tion  in the dimensions of hosts, instances and time.  Special qualifiers may appear after
       a performance metric name to define the enumeration in each dimension.  For example,

	   kernel.percpu.cpu.user :foo :bar #cpu0 @0..2

       defines 6 values corresponding to the time spent executing in user mode on CPU  0  on  the
       hosts ``foo'' and ``bar'' over the last 3 consecutive samples.  The default interpretation
       in the absence of : (host), # (instance) and @ (time) qualifiers is all instances  at  the
       most recent sample time for the default source of PCP performance metrics.
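
       The count of 6 is the product of the three dimensions (2 hosts, 1 instance, 3 samples).
       The following Python sketch only illustrates that enumeration; enumerate_values is a
       hypothetical helper, not a pmie interface.

```python
from itertools import product


def enumerate_values(metric, hosts, instances, samples):
    """List every (metric, host, instance, sample) combination."""
    return [(metric, h, i, t) for h, i, t in product(hosts, instances, samples)]


combos = enumerate_values(
    "kernel.percpu.cpu.user",
    hosts=["foo", "bar"],   # :foo :bar
    instances=["cpu0"],     # #cpu0
    samples=range(0, 3),    # @0..2, the last 3 consecutive samples
)
print(len(combos))
```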

       Host and instance names that do not follow the rules for variables in programming
       languages, i.e. alphabetic optionally followed by alphanumerics, should be enclosed in
       single quotes.

       Expression  evaluation  follows the law of ``least surprises''.	Where performance metrics
       have the semantics of a counter, pmie will automatically convert to a rate based upon con-
       secutive  samples and the time interval between these samples.  All expressions are evalu-
       ated in double precision, and where appropriate, automatically scaled into canonical units
       of ``bytes'', ``seconds'' and ``counts''.
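
       The counter-to-rate conversion described above amounts to dividing the change in value by
       the elapsed time.  A minimal Python sketch follows, with hypothetical sample values; it
       ignores counter wrap-around, which a real implementation would need to consider.

```python
def counter_rate(prev_value, prev_time, curr_value, curr_time):
    """Rate per second between two counter samples (times in seconds)."""
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr_value - prev_value) / elapsed


# Hypothetical samples of a syscall counter taken 10 seconds apart.
rate = counter_rate(120000, 0.0, 180000, 10.0)
```

       With these made-up numbers the counter advanced by 60000 in 10 seconds, so the rate
       compared against a threshold such as ``5000 count/sec'' would be 6000 per second.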

       A rule is a special form of expression that specifies a condition or logical expression, a
       special operator (->) and actions to be performed when the condition is found to be true.

       The following table summarizes the basic pmie operators:

		     +----------------+--------------------------------------------+
		     |	 Operators    | 	       Explanation		   |
		     +----------------+--------------------------------------------+
		     |+ - * /	      | Arithmetic				   |
		     |< <= == >= > != | Relational (value comparison)		   |
		     |! && ||	      | Boolean 				   |
		     |->	      | Rule					   |
		     |rising	      | Boolean, false to true transition	   |
		     |falling	      | Boolean, true to false transition	   |
		     |rate	      | Explicit rate conversion (rarely required) |
		     +----------------+--------------------------------------------+
       Aggregate operators may be used to aggregate or summarize along one dimension  of  a  set-
       valued  expression.   The following aggregate operators map from a logical expression to a
       logical expression of lower dimension.

		  +-------------------------+-------------+--------------------------+
		  |	  Operators	    |	 Type	  |	  Explanation	     |
		  +-------------------------+-------------+--------------------------+
		  |some_inst		    | Existential | True if at least one set |
		  |some_host		    |		  | member is true in the    |
		  |some_sample		    |		  | associated dimension     |
		  +-------------------------+-------------+--------------------------+
		  |all_inst		    | Universal   | True if all set members  |
		  |all_host		    |		  | are true in the associ-  |
		  |all_sample		    |		  | ated dimension	     |
		  +-------------------------+-------------+--------------------------+
		  |N%_inst		    | Percentile  | True if at least N per-  |
		  |N%_host		    |		  | cent of set members are  |
		  |N%_sample		    |		  | true in the associated   |
		  |			    |		  | dimension		     |
		  +-------------------------+-------------+--------------------------+
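
       The three kinds of logical aggregation in the table can be sketched in Python over a list
       of truth values for one dimension.  The function names are illustrative, and treating an
       empty set as false in percent() is an assumption of this sketch.

```python
def some(values):
    """Existential: true if at least one set member is true."""
    return any(values)


def all_of(values):
    """Universal: true if every set member is true."""
    return all(values)


def percent(n, values):
    """Percentile: true if at least n percent of set members are true."""
    values = list(values)
    true_count = sum(1 for v in values if v)
    return bool(values) and 100 * true_count >= n * len(values)


# Four hypothetical samples of a disk rate, tested against 60 per second.
over = [rate > 60 for rate in (70, 65, 40, 90)]
three_of_four = percent(75, over)
```

       Here three of the four samples exceed the threshold, which satisfies a 75 percent test
       such as ``75 %_sample'' in the EXAMPLES section.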
       The  following  instantial  operators  may be used to filter or limit a set-valued logical
       expression, based on regular expression matching of instance names.  The  logical  expres-
       sion  must be a set involving the dimension of instances, and the regular expression is of
       the form used by egrep(1) or the Extended Regular Expressions of regcomp(3G).

		       +-------------+------------------------------------------+
		       | Operators   |		     Explanation		|
		       +-------------+------------------------------------------+
		       |match_inst   | For each value of the logical expression |
		       |	     | that is ``true'', the result is ``true'' |
		       |	     | if the associated instance name matches	|
		       |	     | the regular expression.	Otherwise the	|
		       |	     | result is ``false''.			|
		       +-------------+------------------------------------------+
		       |nomatch_inst | For each value of the logical expression |
		       |	     | that is ``true'', the result is ``true'' |
		       |	     | if the associated instance name does not |
		       |	     | match the regular expression.  Otherwise |
		       |	     | the result is ``false''. 		|
		       +-------------+------------------------------------------+
       For example, the expression below will be ``true'' for disks attached to controllers 2  or
       3 performing more than 20 operations per second:
	    match_inst "^dks[23]d" disk.dev.total > 20;
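
       The filtering described in the table can be sketched in Python as follows.  The instance
       names and rates are hypothetical, and Python's re module is used in place of POSIX
       extended regular expressions (the two agree for this simple pattern).

```python
import re


def match_inst(pattern, truth_by_instance):
    """A value stays true only if its instance name matches the pattern."""
    regex = re.compile(pattern)
    return {name: bool(value and regex.search(name))
            for name, value in truth_by_instance.items()}


def nomatch_inst(pattern, truth_by_instance):
    """A value stays true only if its instance name does not match."""
    regex = re.compile(pattern)
    return {name: bool(value and not regex.search(name))
            for name, value in truth_by_instance.items()}


# Hypothetical per-disk operation rates, tested against 20 per second.
rates = {"dks1d1": 35.0, "dks2d1": 42.0, "dks3d4": 12.0}
busy = {name: rate > 20 for name, rate in rates.items()}
on_ctl_2_or_3 = match_inst("^dks[23]d", busy)
elsewhere = nomatch_inst("^dks[23]d", busy)
```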

       The  following  aggregate  operators  map  from	an arithmetic expression to an arithmetic
       expression of lower dimension.

		   +-------------------------+-----------+--------------------------+
		   |	   Operators	     |	 Type	 |	 Explanation	    |
		   +-------------------------+-----------+--------------------------+
		   |min_inst		     | Extrema	 | Minimum value across all |
		   |min_host		     |		 | set members in the asso- |
		   |min_sample		     |		 | ciated dimension	    |
		   +-------------------------+-----------+--------------------------+
		   |max_inst		     | Extrema	 | Maximum value across all |
		   |max_host		     |		 | set members in the asso- |
		   |max_sample		     |		 | ciated dimension	    |
		   +-------------------------+-----------+--------------------------+
		   |sum_inst		     | Aggregate | Sum of values across all |
		   |sum_host		     |		 | set members in the asso- |
		   |sum_sample		     |		 | ciated dimension	    |
		   +-------------------------+-----------+--------------------------+
		   |avg_inst		     | Aggregate | Average value across all |
		   |avg_host		     |		 | set members in the asso- |
		   |avg_sample		     |		 | ciated dimension	    |
		   +-------------------------+-----------+--------------------------+
       The aggregate operators count_inst, count_host and count_sample map from a logical expres-
       sion  to an arithmetic expression of lower dimension by counting the number of set members
       for which the expression is true in the associated dimension.

       For action rules, the following actions are defined:

			  +----------+----------------------------------------+
			  |Operators |		    Explanation 	      |
			  +----------+----------------------------------------+
			  |alarm     | Raise a visible alarm with xconfirm(1) |
			  |print     | Display on standard output	      |
			  |shell     | Execute with sh(1)		      |
			  |stomp     | Send a STOMP message to a JMS server   |
			  |syslog    | Append a message to system log file    |
			  +----------+----------------------------------------+
       Multiple actions may be separated by the & and | operators to specify, respectively,
       sequential execution (both actions are executed) and alternate execution (the second
       action is executed only if the first action returns a non-zero error status).

       Arguments to actions are an optional suppression time, and then one or more expressions (a
       string is an expression in this context).  Strings appearing as arguments to an action may
       include the following special selectors that will be replaced at the time  the  action  is
       executed.

       %h  Host(s) that make the left-most top-level expression in the condition true.

       %i  Instance(s) that make the left-most top-level expression in the condition true.

       %v  One	value  from the left-most top-level expression in the condition for each host and
	   instance pair that makes the condition true.

       Note that expansion of the special selectors is done by repeating the whole argument  once
       for each unique binding to any of the qualifying special selectors.  For example if a rule
       were true for the host mumble with instances grunt and snort,  and  for	host  fumble  the
       instance puff makes the rule true, then the action
	    ...
	    -> shell myscript "Warning: %h:%i busy ";
       will execute myscript with the argument string
       "Warning: mumble:grunt busy Warning: mumble:snort busy Warning: fumble:puff busy".

       By comparison, if the action
	    ...
	    -> shell myscript "Warning! busy:" " %h:%i";
       were executed under the same circumstances, then myscript would be executed with the argu-
       ment string "Warning! busy: mumble:grunt mumble:snort fumble:puff".
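
       The two expansions above can be reproduced with a short Python sketch.  It is a model of
       the documented behaviour only, not pmie's implementation: expand_action_args and the
       binding dictionaries are hypothetical, and the values bound to %v are made up.

```python
SELECTORS = ("h", "i", "v")


def expand_action_args(args, bindings):
    """Expand %h, %i and %v in each argument, then join the pieces."""
    out = []
    for arg in args:
        used = [key for key in SELECTORS if "%" + key in arg]
        if not used:
            out.append(arg)  # constant argument: emitted once
            continue
        seen = set()
        for binding in bindings:
            combo = tuple(binding[key] for key in used)
            if combo in seen:
                continue  # repeat only for each unique binding
            seen.add(combo)
            text = arg
            for key in used:
                text = text.replace("%" + key, str(binding[key]))
            out.append(text)
    return "".join(out)


bindings = [
    {"h": "mumble", "i": "grunt", "v": 71},
    {"h": "mumble", "i": "snort", "v": 64},
    {"h": "fumble", "i": "puff", "v": 93},
]
one_arg = expand_action_args(["Warning: %h:%i busy "], bindings)
two_args = expand_action_args(["Warning! busy:", " %h:%i"], bindings)
```

       In this sketch the single-argument form keeps the trailing space from the last repetition
       of the argument.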

       The semantics of the expansion of the special selectors lead to a common usage pattern in
       an action: the first argument is a constant (contains no special selectors), the second
       argument contains the desired special selectors with minimal separator characters, and an
       optional third argument provides a constant postscript (e.g. to terminate any argument
       quoting from the first argument).  If necessary, post-processing (e.g. in myscript) can
       provide the necessary enumeration over each unique expansion of the string containing just
       the special selectors.

       For complex conditions, the bindings to these selectors are not obvious.  It is strongly
       recommended that pmie be used in the debugging mode (specify the -W command line option in
       particular) during rule development.

SCALE FACTORS
       Scale  factors  may  be appended to arithmetic expressions and force linear scaling of the
       value to canonical units.   Simple  scale  factors  are	constructed  from  the	keywords:
       nanosecond,  nanosec, nsec, microsecond, microsec, usec, millisecond, millisec, msec, sec-
       ond, sec, minute, min, hour, byte, Kbyte, Mbyte, Gbyte, Tbyte, count, Kcount  and  Mcount,
       and the operator /, for example ``Kbytes / hour''.
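
       As a rough Python sketch of linear scaling to canonical units: the multipliers below are
       assumptions of this example (binary multiples for space, decimal multiples for counts),
       only the short keyword spellings are listed, and to_canonical is a hypothetical helper.

```python
# Assumed factors relative to the canonical units (bytes, seconds, counts).
FACTORS = {
    "byte": 1, "Kbyte": 1024, "Mbyte": 1024 ** 2,
    "Gbyte": 1024 ** 3, "Tbyte": 1024 ** 4,
    "nsec": 1e-9, "usec": 1e-6, "msec": 1e-3,
    "sec": 1, "min": 60, "hour": 3600,
    "count": 1, "Kcount": 1000, "Mcount": 1000000,
}


def to_canonical(value, numerator, denominator=None):
    """Scale value from numerator/denominator units to canonical units."""
    scaled = value * FACTORS[numerator]
    if denominator is not None:
        scaled = scaled / FACTORS[denominator]
    return scaled


bytes_per_sec = to_canonical(3600, "Kbyte", "hour")  # 3600 Kbyte / hour
seconds = to_canonical(5, "min")                     # 5 min
```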

MACROS
       Macros are defined using expressions of the form:

	    name = constexpr;

       Where name follows the normal rules for variables in programming languages, i.e. alphabetic
       optionally followed by alphanumerics.  constexpr must be a constant expression, either a
       string  (enclosed  in  double quotes) or an arithmetic expression optionally followed by a
       scale factor.

       Macros are expanded when their name, prefixed by a dollar ($) appears  in  an  expression,
       and macros may be nested within a constexpr string.
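
       Nested expansion can be pictured with the following Python sketch, which reuses the macro
       definitions from the Oracle example in the EXAMPLES section.  It is a textual model only:
       the regular expression (which also admits underscores, as in day_of_week) and the nesting
       limit are assumptions, and the real parser may expand macros differently.

```python
import re

# Assumed macro reference syntax: "$" followed by an identifier.
MACRO_REF = re.compile(r"\$([A-Za-z][A-Za-z0-9_]*)")


def expand(text, macros, depth=0):
    """Recursively replace $name references with their definitions."""
    if depth > 20:
        raise ValueError("macro nesting too deep")

    def replace(match):
        name = match.group(1)
        if name not in macros:
            return match.group(0)  # leave unknown names untouched
        return expand(macros[name], macros, depth + 1)

    return MACRO_REF.sub(replace, text)


macros = {
    "db": "oracle.ptg1",
    "host": ":moomba.melbourne.sgi.com",
    "lru": "#'cache buffers lru chain'",
    "gets": "$db.latch.gets $host $lru",
}
expanded = expand("$gets", macros)
```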

       The following reserved macro names are understood.

       minute	 Current minute of the hour.

       hour	 Current hour of the day, in the range 0 to 23.

       day	 Current day of the month, in the range 1 to 31.

       month	 Current month of the year, in the range 0 (January) to 11 (December).

       year	 Current year.

       day_of_week
		 Current day of the week, in the range 0 (Sunday) to 6 (Saturday).

       delta	 Sample interval in effect for this expression.

       Dates  and  times  are  presented in the reporting time zone (see description of -Z and -z
       command line options above).

AUTOMATIC RESTART
       It is often useful for pmie processes to be started and stopped when  the  local  host  is
       booted or shut down, or when they have been detected as no longer running (when they have
       unexpectedly exited for some reason).  Refer to pmie_check(1) for  details  on  automating
       this process.

EVENT MONITORING
       It  is common for production systems to be monitored in a central location.  Traditionally
       on UNIX systems this has been performed by the system log facilities - see logger(1),  and
       syslogd(1).   On Windows, communication with the system event log is handled by pcp-event-
       log(1).

       pmie fits into this model when rules use the syslog  action.   Note  that  if  the  action
       string  begins with -p (priority) and/or -t (tag) then these are extracted from the string
       and treated in the same way as in logger(1) and pcp-eventlog(1).

       However, it is common to have other event monitoring frameworks also, into which  you  may
       wish  to  incorporate performance events from pmie.  You can often use the shell action to
       send events to these frameworks, as they usually provide their own program for injecting
       events into the framework from external sources.

       A final option is use of the stomp (Streaming Text Oriented Messaging Protocol) action,
       which allows pmie to connect to a central JMS (Java Message Service) server and send
       events to the PMIE topic.  Tools can be written to extract these text messages and present
       them to operations people (via desktop popup windows, etc).  Use of the stomp action
       requires a stomp configuration file to be set up, which specifies the location of the JMS
       server host, port number, and username/password.

       The format of this file is as follows:

	    host=messages.sgi.com   # this is the JMS server (required)
	    port=61616		    # and it's listening here (required)
	    timeout=2		    # seconds to wait for server (optional)
	    username=joe	    # (required)
	    password=j03ST0MP	    # (required)
	    topic=PMIE		    # JMS topic for pmie messages (optional)
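
       A file in this format can be checked before pmie is started.  The Python sketch below is
       a hypothetical validator, not pmie's own parser: it assumes one key=value pair per line
       and treats everything after a # as a comment, as in the example above.

```python
REQUIRED = ("host", "port", "username", "password")


def parse_stomp_config(text):
    """Parse key=value lines and check that the required keys are present."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop any trailing comment
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected key=value, got: %r" % raw)
        settings[key.strip()] = value.strip()
    missing = [key for key in REQUIRED if key not in settings]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))
    return settings


SAMPLE = """\
host=messages.sgi.com   # this is the JMS server (required)
port=61616              # and it's listening here (required)
timeout=2               # seconds to wait for server (optional)
username=joe            # (required)
password=j03ST0MP       # (required)
topic=PMIE              # JMS topic for pmie messages (optional)
"""
settings = parse_stomp_config(SAMPLE)
```

       Because of the comment rule assumed here, a password containing a # character would be
       truncated by this sketch.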

       The timeout value specifies the time (in seconds) that pmie should wait	for  acknowledge-
       ments  from  the  JMS  server after sending a message (as required by the STOMP protocol).
       Note that on startup, pmie will wait indefinitely for a connection,  and  will  not  begin
       rule evaluation until that initial connection has been established.  Should the connection
       to the JMS server be lost at any time while pmie is running, pmie will attempt  to  recon-
       nect  on  each  subsequent truthful evaluation of a rule with a stomp action, but not more
       than once per minute.  This is to avoid contributing to network congestion.  In this situ-
       ation,  where  the  STOMP  connection to the JMS server has been severed, the stomp action
       will return a non-zero error value.

FILES
       $PCP_DEMOS_DIR/pmie/*
		 annotated example rules
       $PCP_VAR_DIR/pmns/*
		 default PMNS specification files
       $PCP_TMP_DIR/pmie
		 pmie maintains files in this directory to identify the  running  pmie	instances
		 and  to  export  runtime  information	about each instance - this data forms the
		 basis of the pmcd.pmie performance metrics
       $PCP_PMIECONTROL_PATH
		 the default set of pmie instances to start at boot time - refer to pmie_check(1)
		 for details
       $PCP_SYSCONF_DIR/pmie/*
		 the predefined alarm action scripts (email, log, popup and syslog), the example
		 action script (sample) and the concurrent action control file (control.master).

BUGS
       The lexical scanner and parser will attempt to recover after an error in the input
       expressions.  Parsing resumes after skipping input up to the next semi-colon (;).  During
       this skipping process, however, the scanner is ignorant of comments and strings, so an
       embedded semi-colon may cause parsing to resume at an unexpected place.  This behavior is
       largely benign because pmie will not attempt any expression evaluation until the initial
       syntax error is corrected.

PCP ENVIRONMENT
       Environment variables with the prefix PCP_ are used to parameterize the file and directory
       names used by PCP.  On each installation, the file /etc/pcp.conf contains the local values
       for these variables.  The $PCP_CONF variable may be used to specify an alternative config-
       uration file, as described in pcp.conf(5).

UNIX SEE ALSO
       logger(1).

WINDOWS SEE ALSO
       pcp-eventlog(1).

SEE ALSO
       PCPIntro(1), pmcd(1), pmdumplog(1), pmieconf(1),  pmie_check(1),  pminfo(1),  pmlogger(1),
       pmval(1), PMAPI(3), pcp.conf(5) and pcp.env(5).

USER GUIDE
       For a more complete description of the pmie language, refer to the Performance Co-Pilot
       User's and Administrator's Guide.  This is available online from:
	   http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?\
	       db=bks&fname=/SGI_Admin/books/PCP_IRIX/sgi_html/ch05.html

Performance Co-Pilot			       PCP					  PMIE(1)