innfeed - multi-host, multi-connection, streaming NNTP feeder.
innfeed [ -a spool-dir ] [ -b directory ] [ -C ] [ -c filename ] [ -d num ] [ -e bytes ]
[ -h ] [ -l filename ] [ -m ] [ -M ] [ -o bytes ] [ -p file ] [ -S file ] [ -x ] [ -y ]
[ -z ] [ -v ] [ file ]
This man page describes version 1.0 of innfeed.
Innfeed implements the NNTP protocol for transferring news between computers. It handles
both the standard IHAVE protocol and the CHECK/TAKETHIS streaming extension. Innfeed
can feed any number of remote hosts at once and will open multiple connections to
each host if configured to do so. The only limitations are the process limits for open
file descriptors and memory.
Innfeed has three modes of operation: channel, funnel-file and batch.
Channel mode is used when no filename is given on the command line, the ``input-file''
keyword is not given in the config file, and the ``-x'' option is not given. In channel
mode innfeed runs with stdin connected via a pipe to innd. Whenever innd closes this pipe
(and it has several reasons during normal processing to do so), innfeed will exit. It
first will try to finish sending all articles it was in the middle of transmitting, before
issuing a QUIT command. This means innfeed may take a while to exit depending on how slow
your peers are. It never (well, almost never) just drops the connection.
Funnel-file mode is used when a filename is given as an argument or the ``input-file''
keyword is given in the config file. In funnel-file mode innfeed reads the specified file
for the same formatted information as innd would give in channel mode. It is expected that
innd is continually writing to this file, so when innfeed reaches the end of the file it
will check periodically for new information. To prevent the funnel file from growing
without bound, you will need to periodically move the file aside (or simply remove it)
and have innd flush the file. Then, after the file is flushed by innd, you can send
innfeed a SIGALRM, and it too will close the file and open the new file created by innd.
For example:
    innfeed -p /var/run/news/innfeed.pid my-funnel-file &
    while true; do
        sleep 43200    # rotation interval in seconds; an example value, adjust to taste
        rm -f my-funnel-file
        ctlinnd flush funnel-file-site
        kill -ALRM `cat /var/run/news/innfeed.pid`
    done
Batch mode is used when the ``-x'' flag is used. In batch mode innfeed will ignore stdin,
and will simply process any backlog created by a previously running innfeed. This mode is
not normally needed as innfeed will take care of backlog processing.
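The choice between the three modes can be summarised in a few lines of Python. This is only a sketch of the rules as stated above, not innfeed's actual code: the function and parameter names are invented for illustration, and giving ``-x'' precedence when a filename is also present is an assumption, since the text does not say which wins.

```python
from typing import Optional


def select_mode(batch_flag: bool,
                file_arg: Optional[str],
                input_file_keyword: Optional[str]) -> str:
    """Return 'batch', 'funnel-file' or 'channel' for the given inputs.

    batch_flag         -- True if ``-x'' was given on the command line
    file_arg           -- filename given as the trailing argument, if any
    input_file_keyword -- value of ``input-file'' from the config file, if any
    """
    if batch_flag:
        # Assumption: -x wins over a filename; the man page does not say.
        return "batch"
    if file_arg or input_file_keyword:
        return "funnel-file"
    # No filename, no input-file keyword, no -x: read from innd on stdin.
    return "channel"


if __name__ == "__main__":
    print(select_mode(False, None, None))
    print(select_mode(False, "my-funnel-file", None))
    print(select_mode(True, None, None))
```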
Innfeed expects a couple of things to be able to run correctly: a directory where it can
store backlog files and a configuration file to describe which peers it should handle.
The configuration file is described in innfeed.conf(5). The ``-c'' option can be used to
specify a different file.
For each peer (say, ``foo''), innfeed manages up to 4 files in the backlog directory: a
``foo.lock'' file, which prevents other instances of innfeed from interfering with this
one; a ``foo.input'' file, which has old article information innfeed is reading for
re-processing; a ``foo.output'' file, where innfeed is writing information on articles that
couldn't be processed (normally due to a slow or blocked peer); and a ``foo'' file.
This last file (``foo'') is never created by innfeed, but if innfeed notices it, it will
rename it to ``foo.input'' at the next opportunity and will start reading from it. This
lets you create a batch file and put it in a place where innfeed will find it. You should
never alter the .input or .output files of a running innfeed.
The format of these last three files is:
    /path/to/article <message-id>
This is the same as the first two fields of the lines innd feeds to innfeed, and the same
as the first two fields of the lines of the batch file innd will write if innfeed is
unavailable for some reason. When innfeed processes its own batch files it ignores
everything after the first two whitespace-separated fields, so moving the innd-created
batch file to the appropriate spot will work, even though the lines are longer.
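As an illustration of the backlog layout described above, the following Python sketch names the four per-peer files and reads one backlog line the way the text says innfeed does, keeping only the first two whitespace-separated fields. The helper names are hypothetical, and the field meanings (article path, then Message-ID) are an assumption based on the description of innd's feed lines.

```python
from typing import Dict, NamedTuple, Optional


class BacklogEntry(NamedTuple):
    article: str      # assumed: path of the article in the spool
    message_id: str   # assumed: the article's Message-ID


def backlog_files(peer: str) -> Dict[str, str]:
    """Names of the files managed for one peer in the backlog directory."""
    return {
        "lock": peer + ".lock",      # keeps other innfeed instances away
        "input": peer + ".input",    # old entries being re-processed
        "output": peer + ".output",  # entries that could not be processed
        "drop-in": peer,             # created by the admin, never by innfeed
    }


def parse_backlog_line(line: str) -> Optional[BacklogEntry]:
    """Keep the first two fields; ignore anything after them."""
    fields = line.split()
    if len(fields) < 2:
        return None  # blank or malformed line
    return BacklogEntry(fields[0], fields[1])


if __name__ == "__main__":
    print(backlog_files("foo"))
    print(parse_backlog_line("alt/test/123 <abc@example.com> peer1 peer2"))
```

Because trailing fields are dropped, the same function accepts both the short lines innfeed writes and the longer lines of an innd-created batch file.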
Innfeed writes its current status to the file ``innfeed.status'' (or the file given by the
``-S'' option). This file contains details on the process as a whole, and on each peer
this instance of innfeed is managing.
If innfeed is told to send an article to a host it is not managing, then the article
information will be put into a file matching the pattern ``innfeed-dropped.*'', with part
of the file name matching the pid of the innfeed process that is writing to it. Innfeed
will not process this file except to write to it. If nothing is written to the file then
it will be removed if innfeed exits normally.
Upon receipt of a SIGALRM innfeed will close the funnel-file specified on the command
line, and will reopen it (see funnel file description above).
Innfeed will catch SIGINT and will write a large debugging snapshot of the state of the
running system.
Innfeed will catch SIGHUP and will reload the config file. See innfeed.conf(5) for more
details.
Innfeed will catch SIGCHLD and will close and reopen all backlog files.
Innfeed will catch SIGTERM and will do an orderly shutdown.
Upon receipt of a SIGUSR1 innfeed will increment the debugging level by one; receipt of a
SIGUSR2 will decrement it by one. The debugging level starts at zero (unless the ``-d''
option is used), at which level no debugging information is emitted. A larger value for
the level means more debugging information. Numbers up to 5 are currently useful.
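The signals listed above can be collected into a small lookup table for use by an administration script. The Python below is a sketch for Unix systems only; the action labels and helper names are invented for illustration, and the pid file is whatever was given with ``-p''.

```python
import os
import signal

# Action labels are hypothetical; the signal for each comes from the text above.
SIGNAL_ACTIONS = {
    "reopen-funnel-file": signal.SIGALRM,
    "debug-snapshot": signal.SIGINT,
    "reload-config": signal.SIGHUP,
    "reopen-backlog-files": signal.SIGCHLD,
    "shutdown": signal.SIGTERM,
    "debug-level-up": signal.SIGUSR1,
    "debug-level-down": signal.SIGUSR2,
}


def read_pid(pid_file: str) -> int:
    """Read the process id that innfeed wrote to its pid file."""
    with open(pid_file) as fh:
        return int(fh.read().strip())


def send_action(pid_file: str, action: str) -> None:
    """Deliver the signal for the named action to the running innfeed."""
    os.kill(read_pid(pid_file), SIGNAL_ACTIONS[action])
```

For example, send_action("/var/run/news/innfeed.pid", "reload-config") would have the same effect as sending SIGHUP by hand with kill(1).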
There are 3 different categories of syslog entries for statistics: Host, Connection and
Global.
The Host statistics are generated for a given peer at regular intervals after the first
connection is made (or, if the remote is unreachable, after spooling starts). The Host
statistics give totals over all Connections that have been active during the given time
frame. For example (broken here to fit the page, with ``vixie'' being the peer):
May 23 12:49:08 data innfeed: vixie checkpoint
seconds 1381 offered 2744 accepted 1286
refused 1021 rejected 437 missing 0 spooled 990
on_close 0 unspooled 240 deferred 10 requeued 25
The meanings of these fields are:
seconds The time since innfeed connected to the host or since the statistics were reset
by a ``final'' log entry.
offered The number of IHAVE commands sent to the host if it is not in streaming mode.
The sum of the number of TAKETHIS commands sent when no-CHECK mode is in effect
plus the number of CHECK commands sent in streaming mode (when no-CHECK mode is not
in effect).
accepted The number of articles which were sent to the remote host and accepted by it.
refused The number of articles offered to the host that it indicated it didn't want
because it had already seen the Message-ID. The remote host indicates this by
sending a 435 response to an IHAVE command or a 438 response to a CHECK command.
rejected The number of articles transferred to the host that it did not accept because it
determined either that it already had the article or it did not want it because
of the article's Newsgroups: or Distribution: headers, etc. The remote host
indicates that it is rejecting the article by sending a 437 or 439 response
after innfeed sent the entire article.
missing The number of articles which innfeed was told to offer to the host but which
were not present in the article spool. These articles were probably cancelled
or expired before innfeed was able to offer them to the host.
spooled The number of article entries that were written to the .output backlog file
because the articles could neither be sent to the host nor be refused by it.
Articles are generally spooled either because new articles are arriving more
quickly than they can be offered to the host, or because innfeed closed all the
connections to the host and pushed all the articles currently in progress to the
.output backlog file.
on_close The number of articles that were spooled when innfeed closed all the connections
to the host.
unspooled The number of article entries that were read from the .input backlog file.
deferred The number of articles that the host told innfeed to retry later by sending a
431 or 436 response. Innfeed immediately puts these articles back on the tail
of the queue.
requeued The number of articles that were in progress on connections when innfeed dropped
those connections and put the articles back on the queue. These connections may
have been broken by a network problem or may have become unresponsive, causing
innfeed to time them out.
queue The first number is the average (mean) queue size during the previous logging
interval. The second number is the maximum allowable queue size. The third
number is the percentage of the time that the queue was empty. The fourth
through seventh numbers are the percentages of the time that the queue was >0%
to 25% full, 25% to 50% full, 50% to 75% full, and 75% to <100% full. The last
number is the percentage of the time that the queue was totally full.
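Since the statistics entries are plain name/value pairs, they are easy to pick apart in a log-processing script. The following Python sketch parses the Host checkpoint example shown earlier, joined back into the single line syslog would actually hold. The regular expression and function name are assumptions made for illustration. The ``queue'' field is not handled, because its on-the-wire layout is not shown in this page; any value that is not a plain integer is simply skipped.

```python
import re
from typing import Dict, Optional, Tuple

# The checkpoint example from this page, as one syslog line.
SAMPLE = ("May 23 12:49:08 data innfeed: vixie checkpoint "
          "seconds 1381 offered 2744 accepted 1286 "
          "refused 1021 rejected 437 missing 0 spooled 990 "
          "on_close 0 unspooled 240 deferred 10 requeued 25")

# Program tag, then peer (or peer:connection), then the entry kind.
STATS_RE = re.compile(
    r"innfeed\S*: (?P<peer>\S+) (?P<kind>checkpoint|final|global) (?P<rest>.*)")


def parse_stats(line: str) -> Optional[Tuple[str, str, Dict[str, int]]]:
    """Return (peer, kind, counters) or None if the line is not a stats entry."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    tokens = match.group("rest").split()
    counters = {}
    # Walk the tokens as name/value pairs, keeping integer values only.
    for name, value in zip(tokens[0::2], tokens[1::2]):
        if value.isdigit():
            counters[name] = int(value)
    return match.group("peer"), match.group("kind"), counters


if __name__ == "__main__":
    print(parse_stats(SAMPLE))
```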
If the ``-z'' option is used (see below), then when the peer stats are generated, each
Connection will log its stats too. For example, for connection number zero (one of
several open to the peer):
May 23 12:49:08 data innfeed: vixie:0 checkpoint
seconds 1381 offered 596 accepted 274
refused 225 rejected 97
If you only open a maximum of one Connection to a remote, then there will be a close
correlation between Connection numbers and Host numbers, but in general you can't tie the
two sets of numbers together in any easy or very meaningful way. When a Connection closes
it will always log its stats.
If all Connections for a Host get closed together, then the Host logs its stats as
``final'' and resets its counters. If the feed is so busy that there's always at least one
Connection open and running, then after some amount of time (set via the config file), the
Host stats are logged as final and reset. This makes it easier for other programs to
generate higher-level stats from the log files.
There is one log entry that is emitted for a Host just after its last Connection closes
and innfeed is preparing to exit. This entry contains counts over the entire life of the
process. The ``seconds'' field is from the first time a Connection was successfully built,
or the first time spooling started. If a Host has been completely idle, it will have no
such log entry.
May 23 12:49:08 data innfeed: decwrl global
seconds 1381 offered 34 accepted 22
refused 3 rejected 7 missing 0
The final log entry is emitted immediately before exiting. It contains a summary of the
statistics over the entire life of the process.
Feb 13 14:43:41 data innfeed-0.9.4: ME global
seconds 15742 offered 273441 accepted 45750
refused 222008 rejected 3334 missing 217
-a The ``-a'' flag is used to specify the top of the article spool tree. Innfeed does
a chdir(2) to this directory, so it should probably be an absolute path. The
default is <patharticles in inn.conf>.
-b The ``-b'' flag may be used to specify a different directory for backlog file
storage and retrieval. If the path is relative then it is relative to
<pathspool in inn.conf>. The default is ``innfeed''.
-c The ``-c'' flag may be used to specify a different config file from the default
value. If the path is relative then it is relative to <pathetc in inn.conf>. The
default is ``innfeed.conf''.
-C The ``-C'' flag is used to have innfeed simply check the config file, report on any
errors and then exit.
-d The ``-d'' flag may be used to specify the initial logging level. All debugging
messages go to stderr (see the ``-l'' flag below).
-e The ``-e'' flag may be used to specify the size limit (in bytes) for the .output
backlog files innfeed creates. If the output file gets bigger than 10% more than
the given number, innfeed will replace the output file with the tail of the
original version. The default value is 0, which means there is no limit.
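The size rule for ``-e'' amounts to a simple threshold check, sketched below in Python. The function name is hypothetical, and only the decision of whether to trim is modelled: how much of the tail innfeed keeps is not stated here, so it is left out.

```python
def output_needs_trim(current_size: int, limit: int) -> bool:
    """True if an .output file of current_size bytes exceeds the -e limit.

    A limit of 0 means no limit. Otherwise the file is trimmed once it is
    more than 10% larger than the limit. Integer arithmetic is used so that
    the comparison is exact.
    """
    if limit == 0:
        return False
    return current_size * 10 > limit * 11


if __name__ == "__main__":
    # With -e 1000 the file may grow to 1100 bytes before being trimmed.
    print(output_needs_trim(1100, 1000))
    print(output_needs_trim(1101, 1000))
```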
-h Use the ``-h'' flag to print the usage message.
-l The ``-l'' flag may be used to specify a different log file from stderr. As innd
starts innfeed with stderr attached to /dev/null, using this option can be useful in
catching any abnormal error messages, or any debugging messages (all ``normal''
error messages go to syslog).
-M If innfeed has been built with mmap support, then the ``-M'' flag turns OFF the use
of mmap(), otherwise it has no effect.
-m The ``-m'' flag is used to turn on logging of all missing articles. Normally if an
article is missing, innfeed keeps a count, but logs no further information. When
this flag is used, details about message-id and expected pathname are logged.
-o The ``-o'' flag sets the maximum number of bytes of article data innfeed is
supposed to keep in memory. This doesn't work properly yet.
-p The ``-p'' flag is used to specify the filename to write the pid of the process
into. A relative path is relative to <pathrun in inn.conf>. The default is
``innfeed.pid''.
-S The ``-S'' flag specifies the name of the file to write the periodic status to. If
the path is relative it is considered relative to <pathlog in inn.conf>. The
default is ``innfeed.status''.
-v When the ``-v'' flag is given, version information is printed to stderr and then
innfeed exits.
-x The ``-x'' flag is used to tell innfeed not to expect any article information from
innd but just to process any backlog files that exist and then exit.
-y The ``-y'' flag is used to allow dynamic peer binding. If this flag is used and
article information is received from innd that specifies an unknown peer, then the
peer name is taken to be the IP name too, and an association with it is created.
Using this it is possible to have only the global defaults in the innfeed.conf(5)
file, provided the peer name as used by innd is the same as the IP name. Note that
running innfeed with ``-y'' and no peer in innfeed.conf(5) causes innfeed to drop
the first article.
-z The ``-z'' flag is used to cause each connection, in a parallel feed configuration,
to report statistics when the controller for the connections prints its statistics.
When using the ``-x'' option, the config file entry's ``initial-connections'' field will
be the total number of connections created and used, no matter how big the batch
file is and no matter how large a value the ``max-connections'' field specifies. Thus a
value of 0 for ``initial-connections'' means nothing will happen in ``-x'' mode.
Innfeed does not automatically grab the file out of out.going--this needs to be prepared
for it by external means.
Probably too many other bugs to count.
innfeed.conf config file.
innfeed directory for backlog files.
Written by James Brister <email@example.com> for InterNetNews. This is revision 1.7, dated