RedHat 9 (Linux i386) - man page for webalizer (redhat section 1)


webalizer(1)				  The Webalizer 			     webalizer(1)

NAME
       webalizer - A web server log file analysis tool.

SYNOPSIS
       webalizer [ option ... ] [ log-file ]

       webazolver [ option ... ] [ log-file ]

DESCRIPTION
       The Webalizer is a web server log file analysis program which produces usage statistics in
       HTML format for viewing with a browser.	The results are presented in  both  columnar  and
       graphical  format,  which  facilitates  interpretation.	Yearly, monthly, daily and hourly
       usage statistics are presented, along with the ability to  display  usage  by  site,  URL,
       referrer,  user	agent (browser), username, search strings, entry/exit pages,  and country
       (some information may not be available if not present in the log file being processed).

       The Webalizer supports CLF (common log format) log files, as well as Combined log  formats
       as  defined by NCSA and others, and variations of these which it attempts to handle intel-
       ligently.  In addition, the Webalizer also supports wu-ftpd xferlog formatted  log  files,
       allowing  analysis of ftp servers, and squid proxy logs.  Logs may also be compressed, via
       gzip.  If a compressed log file is detected, it will be automatically  uncompressed  while
       it is read.  Compressed logs must have the standard gzip extension of .gz.
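
       The extension rule above can be illustrated with a minimal Python sketch.  It is not
       The Webalizer's own code; the helper name open_log and the latin-1 decoding are
       assumptions made for the example.

```python
import gzip


def open_log(path):
    """Open a log file as text, decompressing it when the name ends in .gz.

    Hypothetical helper: only the ".gz means gzip" rule comes from the text
    above; the encoding choice is an assumption for this sketch.
    """
    if path.endswith(".gz"):
        # gzip.open in text mode decompresses while the file is read.
        return gzip.open(path, "rt", encoding="latin-1", errors="replace")
    return open(path, "r", encoding="latin-1", errors="replace")
```

       A caller would iterate over the returned file object line by line, regardless of
       whether the log on disk was compressed.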

       webazolver is normally just a symbolic link to webalizer.  When run as webazolver, only
       DNS cache file creation/updates are performed, and the program exits once complete.  All
       normal options and configuration directives are available, although many will not be
       used.  In addition, a DNS cache file must be specified.  If the number of DNS child
       processes to use is not specified, webazolver defaults to 5.

       This documentation applies to The Webalizer Version 2.01

RUNNING THE WEBALIZER
       The Webalizer was designed to be run from a Unix command line prompt or as a crond(8) job.
       Once executed, the general flow of the program is:

       o       A default configuration file is looked for.  A file named webalizer.conf is
               searched for in the current directory and, if found, its configuration data is
               parsed.  If the file is not present in the current directory, the file
               /etc/webalizer.conf is searched for and, if found, is used instead.

       o       Any  command line arguments given to the program are parsed.  This may include the
	       specification of a configuration file, which  is  processed  at	the  time  it  is
	       encountered.

       o       If  a  log  file was specified, it is opened and made ready for processing.  If no
	       log file was given, STDIN is used for input.  If the log filename  '-'  is  speci-
	       fied, STDIN will be forced.

       o       If an output directory was specified, the program does a chdir(2) to that
               directory in preparation for generating output.  If no output directory was
               given, the current directory is used.

       o       If  a  non-zero	number	of  DNS  Children  processes were specified, they will be
	       started, and the specified log file will be processed, creating	or  updating  the
	       specified DNS cache file.

       o       If  no  hostname  was  given,  the  program  attempts  to get the hostname using a
	       uname(2) system call.  If that fails, localhost is used.

       o       A history file is searched for in the current directory (output directory) and
               read if found.  This file keeps totals for previous months, which are used in
               the main index.html HTML document.  Note: The file location can now be
               specified with the HistoryName configuration option.

       o       If incremental processing was specified, a data file is searched for and loaded if
	       found, containing the 'internal state' data of the program at the end of a  previ-
	       ous  run.   Note:  The file location can now be specified with the IncrementalName
	       configuration option.

       o       Main processing begins on the log file.  If the log spans multiple months, a
               separate HTML document is created for each month.

       o       After main processing, the main index.html page is created, which has totals by
               month and links to each month's HTML document.

       o       A new history file is saved to disk, which includes totals generated by The Webal-
	       izer during the current run.

       o       If  incremental processing was specified, a data file is written that contains the
	       'internal state' data at the end of this run.
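
       The first and third steps above (default configuration lookup and input selection)
       can be sketched in Python.  This is an illustration under stated assumptions, not
       the program's implementation; the function names and the etc_dir parameter are
       hypothetical.

```python
import os


def find_default_config(cwd=".", etc_dir="/etc"):
    """Return the first webalizer.conf found, or None.

    The current directory is checked first, then /etc, as described above.
    """
    for directory in (cwd, etc_dir):
        candidate = os.path.join(directory, "webalizer.conf")
        if os.path.isfile(candidate):
            return candidate
    return None


def uses_stdin(log_file):
    """True when input comes from STDIN: no log file given, or the name '-'."""
    return log_file is None or log_file == "-"
```

       Command line arguments are parsed after this lookup, so a configuration file named
       with -c is processed in addition to any default file that was found.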

INCREMENTAL PROCESSING
       Version 1.2x of The Webalizer adds incremental run capability.  Simply  put,  this  allows
       processing  large  log files by breaking them up into smaller pieces, and processing these
       pieces instead.	What this means in real terms is that you can now rotate your  log  files
       as  often  as  you want, and still be able to produce monthly usage statistics without the
       loss of any detail.  Basically, The Webalizer saves and restores all internal  data  in	a
       file  named webalizer.current.  This allows the program to 'start where it left off' so to
       speak, and allows the preservation of detail from one run to the next.  The data file is
       placed in the current output directory, and is a plain ASCII text file that can be viewed
       with any standard text editor.  Its location and name may be changed using the
       IncrementalName configuration keyword.

       Some special precautions need to be taken when using the incremental run capability of The
       Webalizer.  Configuration options should not be changed between runs, as that could  cause
       corruption of the internal data stored.	For example, changing the MangleAgents level will
       cause different representations of user agents to be stored, producing invalid results  in
       the user agents section of the report.  If you need to change configuration options, do it
       at the end of the month, after normal processing of the previous month and before
       processing the current month.  You may also want to delete the webalizer.current file.

       The Webalizer also attempts to prevent data duplication by keeping track of the timestamp
       of the last record processed.  This timestamp is then compared to the records being
       processed, and any records that were logged before that timestamp are ignored.  This, in
       theory, should allow you to re-process logs that have already been processed, or process
       logs that contain a mix of processed and not yet processed records, without producing
       duplicate statistics.  The only time this may break is if you have duplicate timestamps
       in two separate log files: any records in the second log file that have the same
       timestamp as the last record in the previous log file will be discarded as if they had
       already been processed.  There are many ways to prevent this; for example, stopping the
       web server before rotating logs avoids the situation.  This setup also requires that you
       always process logs in chronological order, otherwise data loss will occur as a result
       of the timestamp comparison.
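
       The timestamp comparison described above can be sketched as follows.  This is a
       simplified Python model, not the program's code: records are assumed to be
       (timestamp, line) pairs, and the function name is hypothetical.

```python
from datetime import datetime


def filter_new_records(records, saved_timestamp):
    """Drop records at or before the timestamp saved by the previous run.

    Returns the kept records and the timestamp to save for the next run.
    Records equal to the saved timestamp are dropped too, which is the
    duplicate-timestamp caveat described in the text.
    """
    kept = [(ts, line) for ts, line in records
            if saved_timestamp is None or ts > saved_timestamp]
    newest = max((ts for ts, _ in kept), default=saved_timestamp)
    return kept, newest


# Two rotated logs that share the timestamp 12:00:01 at the boundary.
first_log = [(datetime(2001, 10, 22, 12, 0, 0), "a"),
             (datetime(2001, 10, 22, 12, 0, 1), "b")]
second_log = [(datetime(2001, 10, 22, 12, 0, 1), "c"),
              (datetime(2001, 10, 22, 12, 0, 2), "d")]

kept1, saved = filter_new_records(first_log, None)
kept2, saved = filter_new_records(second_log, saved)
```

       In this model record "c" is discarded because it carries the same timestamp as the
       last record of the first log, while "d" is kept.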

REVERSE DNS LOOKUPS
       The Webalizer supports reverse DNS lookups through a DNS cache file that  is  either  cre-
       ated/updated  at run-time, or has been previously created, either by a previous run of the
       webalizer, or by running the stand-alone version, webazolver.  In order to perform reverse
       DNS  lookups,  a DNSCache filename must be specified.  In order to create/update the cache
       file at run-time, the DNSChildren number must be non-zero.  The DNSChildren value
       specifies the number of child processes to fork, each of which will perform reverse DNS
       lookups in order to create/update the DNS cache file.  See the file DNS.README for
       additional information.
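
       The idea of a lookup cache can be illustrated with a small Python sketch.  It is a
       stand-in only: an in-memory dictionary replaces the DNS cache file, no child
       processes are forked, and the function name resolve is hypothetical.

```python
import socket


def resolve(ip, cache, lookup=socket.gethostbyaddr):
    """Return the hostname for ip, consulting the cache before the resolver.

    Falls back to the address itself when the reverse lookup fails
    (an assumption made for this sketch).
    """
    if ip in cache:
        return cache[ip]
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        name = lookup(ip)[0]
    except OSError:
        name = ip
    cache[ip] = name
    return name
```

       Each address is resolved at most once; later requests from the same address are
       answered from the cache.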

COMMAND LINE OPTIONS
       The  Webalizer  supports  many different configuration options that will alter the way the
       program behaves and generates output.  Most of these can be specified on the command line,
       while  some  can  only  be specified in a configuration file. The command line options are
       listed below, with references to the corresponding configuration file keywords.

       General Options

       -h      Display all available command line options and exit program.

       -v -V   Display program version and exit program.

       -d      Debug.  Display debugging information for errors and warnings.

       -i      IgnoreHist.  Ignore history.  USE WITH CAUTION.  This causes The Webalizer to
               ignore only the previous monthly history file.  Incremental data (if present) is
               still processed.

       -p      Incremental.  Preserve internal data between runs.

       -q      Quiet.  Suppress informational messages.  Does not suppress warnings or errors.

       -Q      ReallyQuiet.  Suppress all messages including warnings and errors.

       -T      TimeMe.	Force display of timing information at end of processing.

       -c file Use configuration file file.

       -n name HostName.  Use the hostname name.

       -o dir  OutputDir.  Use output directory dir.

       -t name ReportTitle.  Use name for report title.

       -F ( clf | ftp | squid )
	       LogType.  Specify log type to be processed.  Value can be either clf, ftp or squid
	       format.	 If not specified, will default to CLF format.	FTP logs must be in stan-
	       dard wu-ftpd xferlog format.

       -f      FoldSeqErr.  Fold out of sequence log records back into analysis, by  treating  as
	       if  they  were  the  same  date/time  as  the  last good record.  Normally, out of
	       sequence log records are simply ignored.

       -Y      CountryGraph.  Suppress country graph.

       -G      HourlyGraph.  Suppress hourly graph.

       -x name HTMLExtension.  Defines HTML file extension to use.  If not specified, defaults to
	       html.  Do not include the leading period.

       -H      HourlyStats.  Suppress hourly statistics.

       -L      GraphLegend.  Suppress color coded graph legends.

       -l num  GraphLines.   Specify number of background lines. Default is 2.	Use zero ('0') to
	       disable the lines.

       -P name PageType.  Specify file extensions that are considered pages.  Sometimes  referred
	       to as pageviews.

       -m num  VisitTimeout.   Specify the Visit timeout period.  Specified in number of seconds.
	       Default is 1800 seconds (30 minutes).

       -I name IndexAlias.  Use the filename name as an additional alias for index.*.

       -M num  MangleAgents.  Mangle user agent names according to the mangle level specified  by
	       num.  Mangle levels are:

	       5   Browser name and major version.

	       4   Browser name, major and minor version.

	       3   Browser name, major version, minor version to two decimal places.

	       2   Browser name, major and minor versions and sub-version.

	       1   Browser name, version and machine type if possible.

	       0   All information (left unchanged).

       -g num  GroupDomains.  Automatically  group sites by domain.  The grouping level specified
	       by num can be thought of as 'the number of dots' to display in the grouping.   The
	       default value of 0 disables any domain grouping.

       -D name DNSCache.  Use the DNS cache file name.

       -N num  DNSChildren.  Use num DNS child processes to perform DNS lookups, either
               creating or updating the DNS cache file.  Specify zero (0) to disable cache file
               creation/updates.  If given, a DNS cache filename must be specified.

       Hide Options

       -a name HideAgent.  Hide user agents matching name.

       -r name HideReferrer.  Hide referrer matching name.

       -s name HideSite.  Hide site matching name.

       -X      HideAllSites.  Hide all individual sites (only display groups).

       -u name HideURL.  Hide URL matching name.

       Table size options

       -A num  TopAgents.  Display the top num user agents table.

       -R num  TopReferrers.  Display the top num referrers table.

       -S num  TopSites.  Display the top num sites table.

       -U num  TopURLs.  Display the top num URLs table.

       -C num  TopCountries.  Display the top num countries table.

       -e num  TopEntry.  Display the top num entry pages table.

       -E num  TopExit.  Display the top num exit pages table.
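
       As an illustration of how the options above combine, the following Python sketch
       assembles a command line for an incremental run.  Only flags documented above are
       used; the helper name, paths and hostname are hypothetical values chosen for the
       example.

```python
import shlex


def build_command(log_file, output_dir, hostname, incremental=True):
    """Build an argument list: options first, log file last (see SYNOPSIS)."""
    argv = ["webalizer"]
    if incremental:
        argv.append("-p")          # Incremental: preserve data between runs
    argv += ["-o", output_dir]     # OutputDir
    argv += ["-n", hostname]       # HostName
    argv.append(log_file)
    return argv


argv = build_command("/var/log/httpd/access_log", "/var/www/usage",
                     "www.example.com")
command_line = shlex.join(argv)    # shell-quoted string for display or cron
```

       The list form can be passed to subprocess.run unchanged, which avoids shell quoting
       problems when a report title or path contains spaces.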

CONFIGURATION FILES
       Configuration files are standard ascii(7) text files that may be created or edited using
       any standard editor.  Blank lines and lines that begin with a pound sign ('#') are
       ignored.  Any other lines are considered to be configuration lines, and have the form
       "Keyword Value", where 'Keyword' is one of the currently available configuration
       keywords defined below, and 'Value' is the value to assign to that particular option.
       Any text found after the keyword up to the end of the line is considered the keyword's
       value, so do not include anything after the actual value that is not part of the value
       being assigned.  The file sample.conf provided with the distribution contains further
       documentation and examples.
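
       The line format described above can be modelled with a short Python sketch.  It is
       an approximation for illustration, not the program's parser; the function name is
       hypothetical, and trimming of surrounding whitespace is an assumption.

```python
def parse_config(text):
    """Return (keyword, value) pairs from configuration file text.

    A list is used rather than a dict because keywords such as HideURL
    may appear more than once.
    """
    settings = []
    for raw in text.splitlines():
        # Blank lines and lines beginning with '#' are ignored.
        if not raw.strip() or raw.startswith("#"):
            continue
        parts = raw.split(None, 1)
        keyword = parts[0]
        # Everything after the keyword, to end of line, is the value.
        value = parts[1].strip() if len(parts) > 1 else ""
        settings.append((keyword, value))
    return settings


sample = """# sample configuration
LogFile /var/log/httpd/access_log

HideURL *.gif
HideURL *.png # this text becomes part of the value
"""
pairs = parse_config(sample)
```

       The last pair shows why trailing comments must be avoided: the value of the second
       HideURL line includes the text after the pattern.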

       General Configuration Keywords

       LogFile name
	       Use log file named name.  If none specified, STDIN will be used.

       LogType name
	       Specify	log  file  type as name. Values can be either web, squid or ftp, with the
	       default being web.

       OutputDir dir
	       Create output in the directory dir.  If none specified, the current directory will
	       be used.

       HistoryName name
	       Filename  to  use  for history file.  Relative to output directory unless absolute
	       name is given (ie: starts with '/'). Defaults to 'webalizer.hist' in the  standard
	       output directory.

       ReportTitle name
	       Use the title string name for the report title.  If none specified, use the
	       default of (in English) "Usage Statistics for ".

       Hostname name
	       Set the hostname for the report as name.  If none specified, an	attempt  will  be
	       made  to gather the hostname via a uname(2) system call.  If that fails, localhost
	       will be used.

       UseHTTPS ( yes | no )
	       Use https:// on links to URLs, instead of the default http://, in the 'Top URLs'
	       table.

       Quiet ( yes | no )
	       Suppress informational messages.  Warning and Error messages will not be suppressed.

       ReallyQuiet ( yes | no )
	       Suppress all messages, including Warning and Error messages.

       Debug ( yes | no )
	       Print extra debugging information on Warnings and Errors.

       TimeMe ( yes | no )
	       Force timing information at end of processing.

       GMTTime ( yes | no )
	       Use GMT (UTC) time instead of local timezone for reports.

       IgnoreHist ( yes | no )
	       Ignore  previous monthly history file.  USE WITH CAUTION.  Does not prevent Incre-
	       mental file processing.

       FoldSeqErr ( yes | no )
	       Fold out of sequence log records back into analysis by treating them  as  if  they
	       had  the  same  date/time  as the last good record.  Normally, out of sequence log
	       records are ignored.

       CountryGraph ( yes | no )
	       Display Country Usage Graph in output report.

       DailyGraph ( yes | no )
	       Display Daily Graph in output report.

       DailyStats ( yes | no )
	       Display Daily Statistics in output report.

       HourlyGraph ( yes | no )
	       Display Hourly Graph in output report.

       HourlyStats ( yes | no )
	       Display Hourly Statistics in output report.

       PageType name
	       Define the file extensions to consider as a page.  If a file is found to have  the
	       same  extension	as  name,  it  will  be  counted  as  a  page (sometimes called a
	       pageview).

       GraphLegend ( yes | no )
	       Allows the color coded graph legends to be enabled/disabled.

       GraphLines num
	       Specify the number of background reference lines displayed on the graphs produced.
	       Disable by using zero ('0'), default is 2.

       VisitTimeout num
	       Specifies the visit timeout value in seconds.  Default is 1800 seconds (30
	       minutes).  A visit is determined by looking at the difference in time between the
	       current and last request from a specific site.  If the difference is greater than
	       or equal to the timeout value, the request is counted as a new visit.

       IndexAlias name
	       Use name as an additional alias for index.*.

       MangleAgents num
	       Mangle user agent names based on mangle level num.  See the -M command line switch
	       for  mangle levels and their meaning.  The default is 0, which doesn't mangle user
	       agents at all.

       SearchEngine name variable
	       Allows the specification of search engines and their query strings.  The  name  is
	       the  name  to  match  against  the referrer string for a given search engine.  The
	       variable is the cgi variable that the search engine uses  for  queries.	 See  the
	       sample.conf file for example usage with common search engines.

       Incremental ( yes | no )
	       Enable Incremental mode processing.

       IncrementalName name
	       Filename  to  use  for  incremental  data.  Relative to output directory unless an
	       absolute name is given (ie: starts with '/').  Defaults to 'webalizer.current'  in
	       the standard output directory.

       DNSCache name
	       Filename  to  use for the DNS cache.  Relative to output directory unless an abso-
	       lute name is given (ie: starts with '/').

       DNSChildren num
	       Number of children DNS processes to run in order to create/update  the  DNS  cache
	       file.  Specify zero (0) to disable.
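
       The VisitTimeout rule given earlier in this list can be sketched in Python.  This
       is a simplified model under stated assumptions: requests are (site, time in
       seconds) pairs in chronological order, the first request from a site is counted as
       a visit, and the function name is hypothetical.

```python
def count_visits(requests, timeout=1800):
    """Count visits using the rule described for VisitTimeout."""
    last_seen = {}
    visits = 0
    for site, when in requests:
        previous = last_seen.get(site)
        # A gap greater than or equal to the timeout starts a new visit.
        if previous is None or when - previous >= timeout:
            visits += 1
        last_seen[site] = when
    return visits


requests = [
    ("a.example.com", 0),
    ("b.example.com", 10),
    ("a.example.com", 1799),   # 1799 s after the last request: same visit
    ("a.example.com", 3599),   # 1800 s after the last request: new visit
]
total = count_visits(requests)
```

       With the default timeout of 1800 seconds, this sample yields three visits: two from
       a.example.com and one from b.example.com.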

       Top Table Keywords

       TopAgents num
	       Display the top num User Agents table. Use zero to disable.

       AllAgents ( yes | no )
	       Create separate HTML page with All User Agents.

       TopReferrers num
	       Display the top num Referrers table. Use zero to disable.

       AllReferrers ( yes | no )
	       Create separate HTML page with All Referrers.

       TopSites num
	       Display the top num Sites table. Use zero to disable.

       TopKSites num
	       Display the top num Sites (by KByte) table.  Use zero to disable.

       AllSites ( yes | no )
	       Create separate HTML page with All Sites.

       TopURLs num
	       Display the top num URLs table. Use zero to disable.

       TopKURLs num
	       Display the top num URLs (by KByte) table.  Use zero to disable.

       AllURLs ( yes | no )
	       Create separate HTML page with All URLs.

       TopCountries num
	       Display the top num Countries in the table. Use zero to disable.

       TopEntry num
	       Display the top num Entry Pages in the table.  Use zero to disable.

       TopExit num
	       Display the top num Exit Pages in the table.  Use zero to disable.

       TopSearch num
	       Display the top num Search Strings in the table.  Use zero to disable.

       AllSearchStr ( yes | no )
	       Create separate HTML page with All Search Strings.

       TopUsers num
	       Display the top num Usernames in the table.  Use zero to disable.  Usernames are
	       only available if using HTTP-based authentication.

       AllUsers ( yes | no )
	       Create separate HTML page with All Usernames.

       Hide/Ignore/Group/Include Keywords

       HideAgent name
	       Hide User Agents that match name.

       HideReferrer name
	       Hide Referrers that match name.

       HideSite name
	       Hide Sites that match name.

       HideAllSites ( yes | no )
	       Hide all individual sites.  This causes only grouped sites to be displayed.

       HideURL name
	       Hide URLs that match name.

       HideUser name
	       Hide Usernames that match name.

       IgnoreAgent name
	       Ignore User Agents that match name.

       IgnoreReferrer name
	       Ignore Referrers that match name.

       IgnoreSite name
	       Ignore Sites that match name.

       IgnoreURL name
	       Ignore URLs that match name.

       IgnoreUser name
	       Ignore Usernames that match name.

       GroupAgent name [Label]
	       Group User Agents that match name.  Display Label in 'Top Agent'  table	if  given
	       (instead of name).

       GroupReferrer name [Label]
	       Group  Referrers  that match name.  Display Label in 'Top Referrer' table if given
	       (instead of name).

       GroupSite name [Label]
	       Group Sites that match name.  Display Label in 'Top Site' table if given  (instead
	       of name).

       GroupDomains num
	       Automatically  group sites by domain.  The value num specifies the level of group-
	       ing, and can be thought of as the 'number of dots' to be displayed.   The  default
	       value of 0 disables domain grouping.

       GroupURL name [Label]
	       Group URLs that match name.  Display Label in 'Top URL' table if given (instead
	       of name).

       GroupUser name [Label]
	       Group Usernames that match name.  Display Label in 'Top Usernames' table if  given
	       (instead of name).

       IncludeSite name
	       Force inclusion of sites that match name.  Takes precedence over Ignore* keywords.

       IncludeURL name
	       Force inclusion of URLs that match name.  Takes precedence over Ignore* keywords.

       IncludeReferrer name
	       Force inclusion of Referrers that match name.  Takes precedence over Ignore*
	       keywords.

       IncludeAgent name
	       Force inclusion of User Agents that match name.	 Takes	precedence  over  Ignore*
	       keywords.

       IncludeUser name
	       Force  inclusion of Usernames that match name.  Takes precedence over Ignore* key-
	       words.
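
       The 'number of dots' description of GroupDomains given earlier in this list can be
       read as keeping that many dots, counted from the right end of the hostname.  The
       Python sketch below follows that reading; the function name is hypothetical, and
       skipping names that end in a digit (numeric addresses) is an assumption.

```python
def group_domain(hostname, level):
    """Return the group name for hostname, or None when no grouping applies."""
    if level <= 0:
        return None                  # 0 disables domain grouping
    if hostname[-1:].isdigit():
        return None                  # assumption: numeric addresses not grouped
    labels = hostname.split(".")
    # Keeping `level` dots means keeping the last `level + 1` labels.
    return ".".join(labels[-(level + 1):])
```

       Under this reading, a level of 1 groups www.example.com as example.com, and a level
       of 2 groups host.dept.example.com as dept.example.com.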

       HTML Generation Keywords

       HTMLExtension text
	       Defines the HTML file extension to use.	Default is  html.   Do	not  include  the
	       leading period!

       HTMLPre text
	       Insert text at the very beginning of the generated HTML file.  Defaults to a
	       standard HTML 3.2 DOCTYPE record.

       HTMLHead text
	       Insert text within the <HEAD></HEAD> block of the HTML file.

       HTMLBody text
	       Insert text in HTML page, starting with the <BODY> tag.	If used, the  first  line
	       must be a <BODY ...> tag.  Multiple lines may be specified.

       HTMLPost text
	       Insert  text  at  top  (before  horiz. rule) of HTML pages.  Multiple lines may be
	       specified.

       HTMLTail text
	       Insert text at bottom of the HTML page.	The text is top and right aligned  within
	       a table column at the end of the report.

       HTMLEnd text
	       Insert text at the very end of the HTML page.  If not specified, the default is to
	       insert the ending </BODY> and </HTML> tags.  If used, you must supply  these  tags
	       yourself.

       Dump Object Keywords

       The Webalizer allows you to export processed data to other programs by using tab delimited
       text files.  The Dump* commands specify which files are to be written, and where.

       DumpPath name
	       Save dump files in directory name.  If not specified, the default output directory
	       will be used.  Do not specify a trailing slash (/).

       DumpExtension name
	       Use  name  as the filename extension for dump files.  If not given, the default of
	       tab will be used.

       DumpHeader ( yes | no )
	       Print a column header as the first record of the file.

       DumpSites ( yes | no )
	       Dump the sites data to a tab delimited file.

       DumpURLs ( yes | no )
	       Dump the URL data to a tab delimited file.

       DumpReferrers ( yes | no )
	       Dump the referrer data to a tab delimited file.  This data is only available if
	       using a log that contains referrer information (ie: a combined format web log).

       DumpAgents ( yes | no )
	       Dump  the user agent data to a tab delimited file.  This data is only available if
	       using a log that contains user agent information (ie: a combined format web log).

       DumpUsers ( yes | no )
	       Dump the username data to a tab delimited file.	This data is  only  available  if
	       processing a wu-ftpd xferlog or a web log that contains http authentication infor-
	       mation.

       DumpSearchStr ( yes | no )
	       Dump the search string data to a tab delimited file.  This data is only	available
	       if  processing  a web log that contains referrer information and had search string
	       information present.
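
       A dump file written with DumpHeader enabled can be read by any tool that handles
       tab delimited text.  The Python sketch below is one way to do so; the function name
       is hypothetical, and the column names in the sample are invented for the example,
       not the actual headers the program writes.

```python
import csv
import io


def read_dump(stream):
    """Read a tab delimited dump whose first record is a column header."""
    return list(csv.DictReader(stream, delimiter="\t"))


# Invented sample data standing in for a dump file.
sample = io.StringIO("Hits\tURL\n10\t/index.html\n3\t/about.html\n")
rows = read_dump(sample)
```

       Each row is returned as a mapping from column name to string value, so numeric
       columns need to be converted by the caller.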

FILES
       webalizer.conf	   Default configuration file.	Is searched for in the current	directory
			   and if not found, in the /etc/ directory.

       webalizer.hist	   Monthly history file for previous 12 months.  (can be changed)

       webalizer.current   Current state data file (Incremental processing).  (can be changed)

       xxxxx_YYYYMM.html   Various monthly HTML output files produced. (extension can be changed)

       xxxxx_YYYYMM.png    Various monthly image files used in the reports.

       xxxxx_YYYYMM.tab    Monthly tab delimited text files.  (extension can be changed)

BUGS
       Report bugs to brad@mrunix.net.

COPYRIGHT
       Copyright  (C)  1997-2000 by Bradford L. Barrett.  Distributed under the GNU GPL.  See the
       files "COPYING" and "Copyright", supplied with all distributions for  additional  informa-
       tion.

AUTHOR
       Bradford L. Barrett <brad@mrunix.net>

Version 2.01				   22-Oct-2001				     webalizer(1)