NetBSD 6.1.5 - man page for bpf (netbsd section 4)

BPF(4)				   BSD Kernel Interfaces Manual 			   BPF(4)

NAME
     bpf -- Berkeley Packet Filter raw network interface

SYNOPSIS
     pseudo-device bpfilter

DESCRIPTION
     The Berkeley Packet Filter provides a raw interface to data link layers in a protocol
     independent fashion.  All packets on the network, even those destined for other hosts,
     are accessible through this mechanism.

     The packet filter appears as a character special device, /dev/bpf.  After opening the
     device, the file descriptor must be bound to a specific network interface with the BIOCSETIF
     ioctl.  A given interface can be shared by multiple listeners, and the filter underlying
     each descriptor will see an identical packet stream.

     Associated with each open instance of a bpf file is a user-settable packet filter.  Whenever
     a packet is received by an interface, all file descriptors listening on that interface apply
     their filter.  Each descriptor that accepts the packet receives its own copy.

     Reads from these files return the next group of packets that have matched the filter.  To
     improve performance, the buffer passed to read must be the same size as the buffers used
     internally by bpf.  This size is returned by the BIOCGBLEN ioctl (see below), and can be set
     with BIOCSBLEN.  Note that an individual packet larger than this size is necessarily
     truncated.

     Since packet data is in network byte order, applications should use the byteorder(3) macros
     to extract multi-byte values.
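
     As an illustration of the byte order point above, the following minimal sketch reads
     16-bit and 32-bit fields from a captured packet.  The helper names pkt_get16 and
     pkt_get32 are hypothetical and not part of bpf; memcpy(3) is used so that the access
     also works on alignment restricted machines.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit network-order field at byte offset off (hypothetical helper). */
uint16_t
pkt_get16(const unsigned char *pkt, size_t off)
{
	uint16_t v;

	memcpy(&v, pkt + off, sizeof(v));	/* no unaligned load */
	return ntohs(v);			/* network -> host order */
}

/* Read a 32-bit network-order field at byte offset off (hypothetical helper). */
uint32_t
pkt_get32(const unsigned char *pkt, size_t off)
{
	uint32_t v;

	memcpy(&v, pkt + off, sizeof(v));
	return ntohl(v);
}
```

     On an Ethernet, for instance, pkt_get16(pkt, 12) would yield the type field in host
     order.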

     A packet can be sent out on the network by writing to a bpf file descriptor.  The writes are
     unbuffered, meaning only one packet can be processed per write.  Currently, only writes to
     Ethernets and SLIP links are supported.

IOCTLS
     The ioctl(2) command codes below are defined in <net/bpf.h>.  All commands require these
     header files to be included:

	   #include <sys/types.h>
	   #include <sys/time.h>
	   #include <sys/ioctl.h>
	   #include <net/bpf.h>

     Additionally, BIOCGETIF and BIOCSETIF require <net/if.h>.

     The (third) argument to the ioctl(2) should be a pointer to the type indicated.

	   BIOCGBLEN (u_int)
		   Returns the required buffer length for reads on bpf files.

	   BIOCSBLEN (u_int)
		   Sets the buffer length for reads on bpf files.  The buffer must be set before
		   the file is attached to an interface with BIOCSETIF.  If the requested buffer
		   size cannot be accommodated, the closest allowable size will be set and
		   returned in the argument.  A read call will result in EINVAL if it is passed a
		   buffer that is not this size.

	   BIOCGDLT (u_int)
		   Returns the type of the data link layer underlying the attached interface.
		   EINVAL is returned if no interface has been specified.  The device types, pre-
		   fixed with ``DLT_'', are defined in <net/bpf.h>.

	   BIOCGDLTLIST (struct bpf_dltlist)
		   Returns an array of the available types of the data link layer underlying the
		   attached interface:

			 struct bpf_dltlist {
				 u_int bfl_len;
				 u_int *bfl_list;
			 };

		   The available types are returned in the array pointed to by the bfl_list field
		   while their length in u_int is supplied to the bfl_len field.  ENOMEM is
		   returned if there is not enough buffer space and EFAULT is returned if a bad
		   address is encountered.  The bfl_len field is modified on return to indicate
		   the actual length in u_int of the array returned.  If bfl_list is NULL, the
		   bfl_len field is set to indicate the required length of an array in u_int.

	   BIOCSDLT (u_int)
		   Changes the type of the data link layer underlying the attached interface.
		   EINVAL is returned if no interface has been specified or the specified type is
		   not available for the interface.

	   BIOCPROMISC
		   Forces the interface into promiscuous mode.  All packets, not just those
		   destined for the local host, are processed.  Since more than one file can be
		   listening on a given interface, a listener that opened its interface
		   non-promiscuously may receive packets promiscuously.  This problem can be
		   remedied with an appropriate filter.

		   The interface remains in promiscuous mode until all files listening promiscu-
		   ously are closed.

	   BIOCFLUSH
		   Flushes the buffer of incoming packets, and resets the statistics that are
		   returned by BIOCGSTATS.

	   BIOCGETIF (struct ifreq)
		   Returns the name of the hardware interface that the file is listening on.  The
		   name is returned in the ifr_name field of ifr.  All other fields are
		   undefined.

	   BIOCSETIF (struct ifreq)
		   Sets the hardware interface associated with the file.  This command must be
		   performed before any packets can be read.  The device is indicated by name
		   using the ifr_name field of the ifreq.  Additionally, performs the actions of
		   BIOCFLUSH.

	   BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
		   Sets or gets the read timeout parameter.  The timeval specifies the length of
		   time to wait before timing out on a read request.  This parameter is
		   initialized to zero by open(2), indicating no timeout.

	   BIOCGSTATS (struct bpf_stat)
		   Returns the following structure of packet statistics:

			 struct bpf_stat {
				 uint64_t bs_recv;
				 uint64_t bs_drop;
				 uint64_t bs_capt;
				 uint64_t bs_padding[13];
			 };

		   The fields are:

			 bs_recv  the number of packets received by the descriptor since opened
				  or reset (including any buffered since the last read call);

			 bs_drop  the number of packets which were accepted by the filter but
				  dropped by the kernel because of buffer overflows (i.e., the
				  application's reads aren't keeping up with the packet traffic);

			 bs_capt  the number of packets accepted by the filter.

	   BIOCIMMEDIATE (u_int)
		   Enables or disables ``immediate mode'', based on the truth value of the
		   argument.  When immediate mode is enabled, reads return immediately upon packet
		   reception.  Otherwise, a read will block until either the kernel buffer
		   becomes full or a timeout occurs.  This is useful for programs like rarpd(8),
		   which must respond to messages in real time.  The default for a new file is
		   off.

	   BIOCSETF (struct bpf_program)
		   Sets the filter program used by the kernel to discard uninteresting packets.
		   An array of instructions and its length are passed in using the following
		   structure:

			 struct bpf_program {
				 u_int bf_len;
				 struct bpf_insn *bf_insns;
			 };

		   The filter program is pointed to by the bf_insns field while its length in
		   units of 'struct bpf_insn' is given by the bf_len field.  Also, the actions of
		   BIOCFLUSH are performed.

		   See section FILTER MACHINE for an explanation of the filter language.

	   BIOCVERSION (struct bpf_version)
		   Returns the major and minor version numbers of the filter language currently
		   recognized by the kernel.  Before installing a filter, applications must check
		   that the current version is compatible with the running kernel.  Version num-
		   bers are compatible if the major numbers match and the application minor is
		   less than or equal to the kernel minor.  The kernel version number is returned
		   in the following structure:

			 struct bpf_version {
				 u_short bv_major;
				 u_short bv_minor;
			 };

		   The current version numbers are given by BPF_MAJOR_VERSION and
		   BPF_MINOR_VERSION from <net/bpf.h>.	An incompatible filter may result in
		   undefined behavior (most likely, an error returned by ioctl(2) or haphazard
		   packet matching).
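
     The compatibility rule stated for BIOCVERSION can be written as a one-line predicate.
     This is only a sketch: the function name bpf_version_compatible is hypothetical, and a
     real program would take the kernel values from the bv_major and bv_minor fields
     returned by BIOCVERSION and the application values from BPF_MAJOR_VERSION and
     BPF_MINOR_VERSION.

```c
/*
 * Return nonzero if a filter built for (app_major, app_minor) may be
 * installed on a kernel reporting (kern_major, kern_minor): the major
 * numbers must match and the application minor must not exceed the
 * kernel minor.
 */
int
bpf_version_compatible(unsigned app_major, unsigned app_minor,
    unsigned kern_major, unsigned kern_minor)
{
	return app_major == kern_major && app_minor <= kern_minor;
}
```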

	   BIOCSRSIG, BIOCGRSIG (u_int)
		   Sets or gets the receive signal.  This signal will be sent to the process or
		   process group specified by FIOSETOWN.  It defaults to SIGIO.

	   BIOCSHDRCMPLT, BIOCGHDRCMPLT (u_int)
		   Sets or gets the status of the ``header complete'' flag.  Set to zero if the
		   link level source address should be filled in automatically by the interface
		   output routine.  Set to one if the link level source address will be written,
		   as provided, to the wire.  This flag is initialized to zero by default.

	   BIOCSSEESENT, BIOCGSEESENT (u_int)
		   Enable/disable or get the ``see sent'' flag status.  If enabled, packets sent
		   by the host (not from bpf) will be passed to the filter.  By default, the flag
		   is enabled (value is 1).

	   BIOCFEEDBACK, BIOCSFEEDBACK, BIOCGFEEDBACK (u_int)
		   Set (or get) ``packet feedback mode''.  This allows injected packets to be fed
		   back as input to the interface when output via the interface is successful.
		   The first name is meant for FreeBSD compatibility; the two others follow the
		   Get/Set convention.  Injected outgoing packets are not returned by BPF to
		   avoid duplication.  This flag is initialized to zero by default.

STANDARD IOCTLS
     bpf supports several standard ioctl(2) commands that allow the user to do asynchronous
     and/or non-blocking I/O on an open bpf file descriptor.

	   FIONREAD (int)
		   Returns the number of bytes that are immediately available for reading.

	   FIONBIO (int)
		   Set or clear non-blocking I/O.  If arg is non-zero, then doing a read(2) when
		   no data is available will return -1 and errno will be set to EAGAIN.  If arg
		   is zero, non-blocking I/O is disabled.  Note: setting this overrides the time-
		   out set by BIOCSRTIMEOUT.

	   FIOASYNC (int)
		   Enable or disable async I/O.  When enabled (arg is non-zero), the process or
		   process group specified by FIOSETOWN will start receiving SIGIO's when packets
		   arrive.  Note that you must do an FIOSETOWN in order for this to take effect,
		   as the system will not default this for you.  The signal may be changed via
		   BIOCSRSIG.

	   FIOSETOWN, FIOGETOWN (int)
		   Set or get the process or process group (if negative) that should receive
		   SIGIO when packets are available.  The signal may be changed using BIOCSRSIG
		   (see above).

BPF HEADER
     The following structure is prepended to each packet returned by read(2):

	   struct bpf_hdr {
		   struct bpf_timeval bh_tstamp;
		   uint32_t bh_caplen;
		   uint32_t bh_datalen;
		   uint16_t bh_hdrlen;
	   };

     The fields, whose values are stored in host order, are:

	   bh_tstamp   The time at which the packet was processed by the packet filter.  This
		       structure differs from the standard struct timeval in that both members
		       are of type long.

	   bh_caplen   The length of the captured portion of the packet.  This is the minimum of
		       the truncation amount specified by the filter and the length of the
		       packet.

	   bh_datalen  The length of the packet off the wire.  This value is independent of the
		       truncation amount specified by the filter.

	   bh_hdrlen   The length of the BPF header, which may not be equal to sizeof(struct
		       bpf_hdr).

     The bh_hdrlen field exists to account for padding between the header and the link level pro-
     tocol.  The purpose here is to guarantee proper alignment of the packet data structures,
     which is required on alignment sensitive architectures and improves performance on many
     other architectures.  The packet filter ensures that the bpf_hdr and the network layer
     header will be word aligned.  Suitable precautions must be taken when accessing the link
     layer protocol fields on alignment restricted machines.  (This isn't a problem on an Ether-
     net, since the type field is a short falling on an even offset, and the addresses are proba-
     bly accessed in a bytewise fashion).

     Additionally, individual packets are padded so that each starts on a word boundary.  This
     requires that an application has some knowledge of how to get from packet to packet.  The
     macro BPF_WORDALIGN is defined in <net/bpf.h> to facilitate this process.	It rounds up its
     argument to the nearest word aligned value (where a word is BPF_ALIGNMENT bytes wide).

     For example, if 'p' points to the start of a packet, this expression will advance it to the
     next packet:

	   p = (char *)p + BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen)

     For the alignment mechanisms to work properly, the buffer passed to read(2) must itself be
     word aligned.  malloc(3) will always return an aligned buffer.
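
     The packet-to-packet stepping described above might be wrapped in a loop as in the
     sketch below.  To keep it compilable without <net/bpf.h>, it uses stand-ins that are
     assumptions of this example: struct cap_hdr is a reduced, hypothetical substitute for
     struct bpf_hdr (no timestamp), and CAP_WORDALIGN mimics BPF_WORDALIGN with a word of
     sizeof(long) bytes.  Real code should use the definitions from the header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for BPF_ALIGNMENT and BPF_WORDALIGN (assumed, see lead-in). */
#define CAP_ALIGNMENT	 sizeof(long)
#define CAP_WORDALIGN(x) (((x) + (CAP_ALIGNMENT - 1)) & ~(CAP_ALIGNMENT - 1))

/* Reduced, hypothetical stand-in for struct bpf_hdr. */
struct cap_hdr {
	uint32_t ch_caplen;	/* captured length */
	uint32_t ch_datalen;	/* length on the wire */
	uint16_t ch_hdrlen;	/* header length, including padding */
};

/* Count the packet records found in the first nread bytes of buf. */
size_t
count_records(const unsigned char *buf, size_t nread)
{
	size_t off = 0, n = 0;
	struct cap_hdr h;

	while (nread - off >= sizeof(h)) {
		memcpy(&h, buf + off, sizeof(h));
		if (h.ch_hdrlen < sizeof(h) || h.ch_hdrlen > nread - off ||
		    h.ch_caplen > nread - off - h.ch_hdrlen)
			break;		/* malformed or truncated record */
		n++;
		/* The packet data starts at buf + off + h.ch_hdrlen. */
		off += CAP_WORDALIGN((size_t)h.ch_hdrlen + h.ch_caplen);
		if (off > nread)
			break;		/* last record had no trailing padding */
	}
	return n;
}
```

     The bounds checks are a design choice of the sketch, so that a short or corrupt buffer
     ends the walk instead of reading past nread.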

FILTER MACHINE
     A filter program is an array of instructions, with all branches forwardly directed,
     terminated by a return instruction.  Each instruction performs some action on the
     pseudo-machine state, which consists of an accumulator, index register, scratch memory
     store, and implicit program counter.

     The following structure defines the instruction format:

	   struct bpf_insn {
		   uint16_t code;
		   u_char  jt;
		   u_char  jf;
		   uint32_t k;
	   };

     The k field is used in different ways by different instructions, and the jt and jf fields
     are used as offsets by the branch instructions.  The opcodes are encoded in a semi-hierar-
     chical fashion.  There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
     BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.  Various other mode and operator bits are or'd into
     the class to give the actual instructions.  The classes and modes are defined in
     <net/bpf.h>.
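
     To make the semi-hierarchical encoding concrete, the sketch below composes one opcode
     and splits it back into its class, size, and mode bits.  The macro values are local
     copies written from memory so that the example compiles without <net/bpf.h>; they are
     believed to match the header but should be treated as assumptions.  The function name
     ldh_abs_opcode is hypothetical.

```c
#include <stdint.h>

/* Local copies of a few <net/bpf.h> definitions (assumed values). */
#define BPF_CLASS(code)	((code) & 0x07)
#define BPF_LD		0x00
#define BPF_SIZE(code)	((code) & 0x18)
#define BPF_H		0x08
#define BPF_MODE(code)	((code) & 0xe0)
#define BPF_ABS		0x20

/* Compose the opcode for "load halfword at absolute packet offset". */
uint16_t
ldh_abs_opcode(void)
{
	return BPF_LD + BPF_H + BPF_ABS;
}
```

     Because the class, size, and mode occupy disjoint bit fields, adding the constants and
     or'ing them give the same result, which is why the instruction tables in this section
     are written with `+'.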

     Below are the semantics for each defined BPF instruction.	We use the convention that A is
     the accumulator, X is the index register, P[] packet data, and M[] scratch memory store.
     P[i:n] gives the data at byte offset ``i'' in the packet, interpreted as a word (n=4),
     unsigned halfword (n=2), or unsigned byte (n=1).  M[i] gives the i'th word in the scratch
     memory store, which is only addressed in word units.  The memory store is indexed from 0 to
     BPF_MEMWORDS-1.  k, jt, and jf are the corresponding fields in the instruction definition.
     ``len'' refers to the length of the packet.

	   BPF_LD  These instructions copy a value into the accumulator.  The type of the source
		   operand is specified by an ``addressing mode'' and can be a constant
		   (BPF_IMM), packet data at a fixed offset (BPF_ABS), packet data at a variable
		   offset (BPF_IND), the packet length (BPF_LEN), or a word in the scratch memory
		   store (BPF_MEM).  For BPF_IND and BPF_ABS, the data size must be specified as
		   a word (BPF_W), halfword (BPF_H), or byte (BPF_B).  Arithmetic overflow when
		   calculating a variable offset terminates the filter program and the packet is
		   ignored.  The semantics of all the recognized BPF_LD instructions follow.

			 BPF_LD+BPF_W+BPF_ABS	 A <- P[k:4]
			 BPF_LD+BPF_H+BPF_ABS	 A <- P[k:2]
			 BPF_LD+BPF_B+BPF_ABS	 A <- P[k:1]
			 BPF_LD+BPF_W+BPF_IND	 A <- P[X+k:4]
			 BPF_LD+BPF_H+BPF_IND	 A <- P[X+k:2]
			 BPF_LD+BPF_B+BPF_IND	 A <- P[X+k:1]
			 BPF_LD+BPF_W+BPF_LEN	 A <- len
			 BPF_LD+BPF_IMM 	 A <- k
			 BPF_LD+BPF_MEM 	 A <- M[k]

	   BPF_LDX These instructions load a value into the index register.  Note that the
		   addressing modes are more restricted than those of the accumulator loads, but
		   they include BPF_MSH, a hack for efficiently loading the IP header length.

			 BPF_LDX+BPF_W+BPF_IMM	  X <- k
			 BPF_LDX+BPF_W+BPF_MEM	  X <- M[k]
			 BPF_LDX+BPF_W+BPF_LEN	  X <- len
			 BPF_LDX+BPF_B+BPF_MSH	  X <- 4*(P[k:1]&0xf)

	   BPF_ST  This instruction stores the accumulator into the scratch memory.  We do not
		   need an addressing mode since there is only one possibility for the
		   destination.

			 BPF_ST    M[k] <- A

	   BPF_STX This instruction stores the index register in the scratch memory store.

			 BPF_STX    M[k] <- X

	   BPF_ALU The ALU instructions perform operations between the accumulator and the index
		   register or a constant, and store the result back in the accumulator.  For
		   binary operations, a source mode is required (BPF_K or BPF_X).

			 BPF_ALU+BPF_ADD+BPF_K	  A <- A + k
			 BPF_ALU+BPF_SUB+BPF_K	  A <- A - k
			 BPF_ALU+BPF_MUL+BPF_K	  A <- A * k
			 BPF_ALU+BPF_DIV+BPF_K	  A <- A / k
			 BPF_ALU+BPF_AND+BPF_K	  A <- A & k
			 BPF_ALU+BPF_OR+BPF_K	  A <- A | k
			 BPF_ALU+BPF_LSH+BPF_K	  A <- A << k
			 BPF_ALU+BPF_RSH+BPF_K	  A <- A >> k
			 BPF_ALU+BPF_ADD+BPF_X	  A <- A + X
			 BPF_ALU+BPF_SUB+BPF_X	  A <- A - X
			 BPF_ALU+BPF_MUL+BPF_X	  A <- A * X
			 BPF_ALU+BPF_DIV+BPF_X	  A <- A / X
			 BPF_ALU+BPF_AND+BPF_X	  A <- A & X
			 BPF_ALU+BPF_OR+BPF_X	  A <- A | X
			 BPF_ALU+BPF_LSH+BPF_X	  A <- A << X
			 BPF_ALU+BPF_RSH+BPF_X	  A <- A >> X
			 BPF_ALU+BPF_NEG	  A <- -A

	   BPF_JMP The jump instructions alter flow of control.  Conditional jumps compare the
		   accumulator against a constant (BPF_K) or the index register (BPF_X).  If the
		   result is true (or non-zero), the true branch is taken, otherwise the false
		   branch is taken.  Jump offsets are encoded in 8 bits so the longest jump is
		   256 instructions.  However, the jump always (BPF_JA) opcode uses the 32 bit k
		   field as the offset, allowing arbitrarily distant destinations.  All condi-
		   tionals use unsigned comparison conventions.

			 BPF_JMP+BPF_JA 	  pc += k
			 BPF_JMP+BPF_JGT+BPF_K	  pc += (A > k) ? jt : jf
			 BPF_JMP+BPF_JGE+BPF_K	  pc += (A >= k) ? jt : jf
			 BPF_JMP+BPF_JEQ+BPF_K	  pc += (A == k) ? jt : jf
			 BPF_JMP+BPF_JSET+BPF_K   pc += (A & k) ? jt : jf
			 BPF_JMP+BPF_JGT+BPF_X	  pc += (A > X) ? jt : jf
			 BPF_JMP+BPF_JGE+BPF_X	  pc += (A >= X) ? jt : jf
			 BPF_JMP+BPF_JEQ+BPF_X	  pc += (A == X) ? jt : jf
			 BPF_JMP+BPF_JSET+BPF_X   pc += (A & X) ? jt : jf

	   BPF_RET The return instructions terminate the filter program and specify the amount of
		   packet to accept (i.e., they return the truncation amount).	A return value of
		   zero indicates that the packet should be ignored.  The return value is either
		   a constant (BPF_K) or the accumulator (BPF_A).

			 BPF_RET+BPF_A	  accept A bytes
			 BPF_RET+BPF_K	  accept k bytes

	   BPF_MISC
		   The miscellaneous category was created for anything that doesn't fit into the
		   above classes, and for any new instructions that might need to be added.  Cur-
		   rently, these are the register transfer instructions that copy the index reg-
		   ister to the accumulator or vice versa.

			 BPF_MISC+BPF_TAX    X <- A
			 BPF_MISC+BPF_TXA    A <- X

     The BPF interface provides the following macros to facilitate array initializers:

	   BPF_STMT (opcode, operand)
	   BPF_JUMP (opcode, operand, true_offset, false_offset)
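
     The sketch below shows how the two macros might be used to build a small filter, and
     how the load, jump, and return semantics from this section play out on a packet.  It
     is self-contained and therefore carries its own copies of struct bpf_insn, the macros,
     and a few opcode values; all of these are assumptions written from memory, and real
     programs should include <net/bpf.h> instead.  The names ip_only and run_filter are
     hypothetical, and run_filter is a toy evaluator for just the three opcodes used here,
     not the kernel's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of a few <net/bpf.h> definitions (assumed values). */
#define BPF_LD	0x00
#define BPF_JMP	0x05
#define BPF_RET	0x06
#define BPF_H	0x08
#define BPF_ABS	0x20
#define BPF_JEQ	0x10
#define BPF_K	0x00

struct bpf_insn {
	uint16_t code;
	unsigned char jt;
	unsigned char jf;
	uint32_t k;
};

#define BPF_STMT(code, k)	  { (uint16_t)(code), 0, 0, (k) }
#define BPF_JUMP(code, k, jt, jf) { (uint16_t)(code), (jt), (jf), (k) }

/* Accept whole IPv4 frames on an Ethernet (type field at offset 12). */
struct bpf_insn ip_only[] = {
	BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x0800, 0, 1),
	BPF_STMT(BPF_RET+BPF_K, (uint32_t)-1),
	BPF_STMT(BPF_RET+BPF_K, 0),
};

/*
 * Toy evaluator: returns the number of bytes to accept, or 0 to ignore
 * the packet.  Unknown opcodes and loads past the end of the packet
 * reject the packet.
 */
uint32_t
run_filter(const struct bpf_insn *prog, size_t ninsn,
    const unsigned char *p, size_t len)
{
	uint32_t A = 0;
	size_t pc = 0;

	while (pc < ninsn) {
		const struct bpf_insn *in = &prog[pc++];

		switch (in->code) {
		case BPF_LD+BPF_H+BPF_ABS:	/* A <- P[k:2] */
			if (in->k > len || len - in->k < 2)
				return 0;
			A = (uint32_t)p[in->k] << 8 | p[in->k + 1];
			break;
		case BPF_JMP+BPF_JEQ+BPF_K:	/* pc += (A == k) ? jt : jf */
			pc += (A == in->k) ? in->jt : in->jf;
			break;
		case BPF_RET+BPF_K:		/* accept k bytes */
			return in->k;
		default:
			return 0;
		}
	}
	return 0;
}
```

     Note that the jump offsets are relative to the instruction following the jump, so a jt
     of 0 falls through to the next instruction.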

SYSCTLS
     The following sysctls are available when bpf is enabled:

     net.bpf.maxbufsize     Sets the maximum buffer size available for bpf peers.

     net.bpf.stats	    Shows bpf statistics.  They can be retrieved with the netstat(1)
			    utility.

     net.bpf.peers	    Shows the current bpf peers.  This is only available to the super
			    user and can also be retrieved with the netstat(1) utility.


EXAMPLES
     The following filter is taken from the Reverse ARP Daemon.  It accepts only Reverse ARP
     requests.

	   struct bpf_insn insns[] = {
		   BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
		       sizeof(struct ether_header)),
	   };

     This filter accepts only IP packets between host 128.3.112.15 and 128.3.112.35.

	   struct bpf_insn insns[] = {
		   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
		   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
		   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
		   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
		   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	   };

     Finally, this filter returns only TCP finger packets.  We must parse the IP header to reach
     the TCP header.  The BPF_JSET instruction checks that the IP fragment offset is 0 so we are
     sure that we have a TCP header.

	   struct bpf_insn insns[] = {
		   BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
		   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
		   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
		   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
	   };
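
     For readers who find the finger filter easier to follow as ordinary code, the checks it
     is described as making can be sketched in plain C as below.  This is an illustration
     of the logic, not a translation of the instruction array: the function name is_finger
     is hypothetical, and the offsets assume an Ethernet link header of 14 bytes followed
     by an IPv4 header.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 16-bit value. */
static uint16_t
be16(const unsigned char *p)
{
	return (uint16_t)(p[0] << 8 | p[1]);
}

/* Return nonzero if the frame is a TCP packet to or from port 79. */
int
is_finger(const unsigned char *p, size_t len)
{
	size_t ihl;

	if (len < 14 + 20)			/* Ethernet + minimal IP header */
		return 0;
	if (be16(p + 12) != 0x0800)		/* Ethernet type is not IPv4 */
		return 0;
	if (p[23] != 6)				/* IP protocol is not TCP */
		return 0;
	if (be16(p + 20) & 0x1fff)		/* nonzero fragment offset */
		return 0;
	ihl = 4 * (size_t)(p[14] & 0x0f);	/* same value BPF_MSH loads */
	if (len < 14 + ihl + 4)			/* need both TCP port fields */
		return 0;
	return be16(p + 14 + ihl) == 79 || be16(p + 14 + ihl + 2) == 79;
}
```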

SEE ALSO
     ioctl(2), read(2), select(2), signal(3), tcpdump(8)

     S. McCanne and V. Jacobson, "The BSD Packet Filter: A New Architecture for User-level Packet
     Capture", Proceedings of the 1993 Winter USENIX.

HISTORY
     The Enet packet filter was created in 1980 by Mike Accetta and Rick Rashid at
     Carnegie-Mellon University.  Jeffrey Mogul, at Stanford, ported the code to BSD and
     continued its development from 1983 on.  Since then, it has evolved into the ULTRIX
     Packet Filter at DEC, a STREAMS NIT module under SunOS 4.1, and BPF.

AUTHORS
     Steven McCanne, of Lawrence Berkeley Laboratory, implemented BPF in Summer 1990.  The design
     was in collaboration with Van Jacobson, also of Lawrence Berkeley Laboratory.

BUGS
     The read buffer must be of a fixed size (returned by the BIOCGBLEN ioctl).

     A file that does not request promiscuous mode may receive promiscuously received packets as
     a side effect of another file requesting this mode on the same hardware interface.  This
     could be fixed in the kernel with additional processing overhead.	However, we favor the
     model where all files must assume that the interface is promiscuous, and if so desired, must
     use a filter to reject foreign packets.

     Under SunOS, if a BPF application reads more than 2^31 bytes of data, read will fail with
     EINVAL.  You can either fix the bug in SunOS, or lseek to 0 when read fails for this reason.

     ``Immediate mode'' and the ``read timeout'' are misguided features.  This functionality can
     be emulated with non-blocking mode and select(2).
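
     One way the emulation mentioned above might look is a small wrapper around select(2)
     that waits for a descriptor to become readable with a caller-supplied timeout.  The
     function name wait_readable and its millisecond parameter are assumptions of this
     sketch; it works on any descriptor and does not depend on bpf itself.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Wait up to timeout_ms milliseconds for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno set by select).
 */
int
wait_readable(int fd, long timeout_ms)
{
	fd_set rfds;
	struct timeval tv;
	int n;

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;
	n = select(fd + 1, &rfds, NULL, NULL, &tv);
	return n > 0 ? 1 : n;
}
```

     With FIONBIO set on the descriptor, a program could call wait_readable and then
     read(2) without risk of blocking, which approximates a read timeout.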

BSD					December 31, 2011				      BSD
