TCP(4) BSD Kernel Interfaces Manual TCP(4)
tcp -- Internet Transmission Control Protocol
socket(AF_INET, SOCK_STREAM, 0);
socket(AF_INET6, SOCK_STREAM, 0);
TCP provides reliable, flow-controlled, two-way transmission of data. It is a byte-stream
protocol used to support the SOCK_STREAM abstraction. TCP uses the standard Internet
address format and, in addition, provides a per-host collection of ``port addresses''.
Thus, each address is composed of an Internet address specifying the host and network, with
a specific TCP port on the host identifying the peer entity.
Sockets using TCP are either ``active'' or ``passive''. Active sockets initiate connections
to passive sockets. By default TCP sockets are created active; to create a passive socket
the listen(2) system call must be used after binding the socket with the bind(2) system
call. Only passive sockets may use the accept(2) call to accept incoming connections. Only
active sockets may use the connect(2) call to initiate connections.
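
The active and passive roles described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the helper names make_passive_loopback() and connect_to_listener() are hypothetical, and the loopback address, the backlog of 5, and the system-chosen port are assumptions made only to keep the example self-contained.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a passive socket on the loopback address.
 * Port 0 asks the system to choose a free port. Returns the descriptor,
 * or -1 on failure. */
int make_passive_loopback(void)
{
    struct sockaddr_in sin;
    int s = socket(AF_INET, SOCK_STREAM, 0);    /* sockets start out active */

    if (s == -1)
        return -1;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(0);
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
        listen(s, 5) == -1) {                   /* listen(2) makes it passive */
        close(s);
        return -1;
    }
    return s;
}

/* Hypothetical helper: open an active socket and connect it to the address
 * that the passive socket ls is bound to. Returns the new descriptor,
 * or -1 on failure. */
int connect_to_listener(int ls)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int c;

    if (getsockname(ls, (struct sockaddr *)&sin, &len) == -1)
        return -1;
    c = socket(AF_INET, SOCK_STREAM, 0);
    if (c == -1)
        return -1;
    if (connect(c, (struct sockaddr *)&sin, len) == -1) {
        close(c);
        return -1;
    }
    return c;
}
```

A caller would then use accept(2) on the descriptor returned by make_passive_loopback() to obtain the server side of the connection; calling accept(2) on a socket that was never passed to listen(2) fails.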
Passive sockets may ``underspecify'' their location to match incoming connection requests
from multiple networks. This technique, termed ``wildcard addressing'', allows a single
server to provide service to clients on multiple networks. To create a socket which listens
on all networks, the Internet address INADDR_ANY must be bound. The TCP port may still be
specified at this time; if the port is not specified the system will assign one. Once a
connection has been established the socket's address is fixed by the peer entity's location.
The address assigned to the socket is the address associated with the network interface through
which packets are being transmitted and received. Normally this address corresponds to the
peer entity's network.
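
Wildcard addressing as described above might look like the following sketch. The helper names listen_wildcard() and local_address() are hypothetical, and the backlog of 5 is an arbitrary assumption. The second helper uses getsockname(2) to read the local address, which is one way to observe that the address is INADDR_ANY before a connection exists and a specific interface address afterwards.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on all networks. A port of 0 lets the system
 * assign one; the port actually bound is stored in *assigned (host byte
 * order). Returns the descriptor, or -1 on failure. */
int listen_wildcard(in_port_t port, in_port_t *assigned)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s == -1)
        return -1;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);    /* underspecified location */
    sin.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
        listen(s, 5) == -1 ||
        getsockname(s, (struct sockaddr *)&sin, &len) == -1) {
        close(s);
        return -1;
    }
    *assigned = ntohs(sin.sin_port);
    return s;
}

/* Hypothetical helper: store the local IPv4 address of fd in *addr
 * (host byte order). Returns 0 on success, -1 on failure. */
int local_address(int fd, in_addr_t *addr)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);

    if (getsockname(fd, (struct sockaddr *)&sin, &len) == -1)
        return -1;
    *addr = ntohl(sin.sin_addr.s_addr);
    return 0;
}
```

Calling local_address() on a descriptor returned by accept(2) reports the address of the interface the connection arrived on rather than the wildcard.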
TCP supports a number of socket options which can be set with setsockopt(2) and tested with
TCP_NODELAY Under most circumstances, TCP sends data when it is presented; when outstand-
ing data has not yet been acknowledged, it gathers small amounts of output to
be sent in a single packet once an acknowledgement is received. For a small
number of clients, such as window systems that send a stream of mouse events
which receive no replies, this packetization may cause significant delays.
Therefore, TCP provides a boolean option, TCP_NODELAY (from <netinet/tcp.h>),
to defeat this algorithm.
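
A minimal sketch of setting and testing TCP_NODELAY follows. The helper names set_nodelay() and get_nodelay() are hypothetical, and the example assumes that the constant IPPROTO_TCP from <netinet/in.h> is used as the option level.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn TCP_NODELAY on (nonzero) or off (0).
 * Returns 0 on success, -1 on failure. */
int set_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: returns 1 if TCP_NODELAY is set on fd, 0 if it is
 * not, and -1 if the option could not be read. */
int get_nodelay(int fd)
{
    int on = 0;
    socklen_t len = sizeof(on);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, &len) == -1)
        return -1;
    return on != 0;
}
```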
TCP_MAXSEG By default, the sending and receiving TCP negotiate between themselves to
determine the maximum segment size to be used for each connection. The
TCP_MAXSEG option allows the user to determine the result of this
negotiation, and to reduce it if desired.
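
Reading and reducing the maximum segment size could be sketched as below. The helper names current_mss() and limit_mss() are hypothetical, and the option value is assumed to be an int. Which values the system accepts, and what it reports on a socket that is not yet connected, depends on the implementation, so callers should check the return values.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: returns the maximum segment size currently reported
 * for fd, or -1 if it could not be read. */
int current_mss(int fd)
{
    int mss = 0;
    socklen_t len = sizeof(mss);

    if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) == -1)
        return -1;
    return mss;
}

/* Hypothetical helper: ask for a smaller maximum segment size.
 * Returns 0 on success, -1 if the system rejects the request. */
int limit_mss(int fd, int mss)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss));
}
```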
TCP_MD5SIG This option enables the use of MD5 digests (also known as TCP-MD5) on writes
to the specified socket. In the current release, only outgoing traffic is
digested; digests on incoming traffic are not verified. The current default
behavior for the system is to respond to a system advertising this option
with TCP-MD5; this may change.
One common use for this in a NetBSD router deployment is to enable
NetBSD-based routers to interwork with Cisco equipment at peering points.
Support for this feature conforms to RFC 2385. Only IPv4 (AF_INET) sessions
are supported.
In order for this option to function correctly, it is necessary for the
administrator to add a tcp-md5 key entry to the system's security associa-
tions database (SADB) using the setkey(8) utility. This entry must have an
SPI of 0x1000 and can therefore only be specified on a per-host basis at this time.
If an SADB entry cannot be found for the destination, the outgoing traffic
will have an invalid digest option prepended, and the following error message
will be visible on the system console: tcp_signature_compute: SADB lookup
failed for %d.%d.%d.%d.
TCP_KEEPIDLE When the SO_KEEPALIVE option is enabled, TCP probes a connection that has
been idle for some amount of time. The default value for this idle period is
4 hours. The TCP_KEEPIDLE option can be used to affect this value for a given
socket, and specifies the number of seconds a connection must be idle before
the first keepalive probe is sent. This option takes an unsigned int value,
with a value greater than 0.
TCP_KEEPINTVL When the SO_KEEPALIVE option is enabled, TCP probes a connection that has
been idle for some amount of time. If the remote system does not respond to
a keepalive probe, TCP retransmits the probe after some amount of time. The
default value for this retransmit interval is 150 seconds. The TCP_KEEPINTVL
option can be used to affect this value for a given socket, and specifies the
number of seconds to wait before retransmitting a keepalive probe. This
option takes an unsigned int value, with a value greater than 0.
TCP_KEEPCNT When the SO_KEEPALIVE option is enabled, TCP probes a connection that has
been idle for some amount of time. If the remote system does not respond to
a keepalive probe, TCP retransmits the probe a certain number of times before
a connection is considered to be broken. The default value for this
keepalive probe retransmit limit is 8. The TCP_KEEPCNT option can be used to
affect this value for a given socket, and specifies the maximum number of
keepalive probes to be sent. This option takes an unsigned int value, with a
value greater than 0.
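
The keepalive options above might be combined as in the following sketch. The helper names tune_keepalive() and keepalive_detect_seconds() are hypothetical. The second helper is only a rough estimate, assuming that a dead peer is detected after the idle period plus one retransmit interval per unanswered probe; with the defaults quoted above this gives 14400 + 150 * 8 = 15600 seconds.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable SO_KEEPALIVE on fd and set the idle period,
 * probe retransmit interval (both in seconds), and probe limit.
 * Returns 0 on success, -1 on the first failure. */
int tune_keepalive(int fd, unsigned int idle, unsigned int intvl,
    unsigned int cnt)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) == -1)
        return -1;
    return 0;
}

/* Hypothetical helper: rough estimate, in seconds, of how long an idle
 * connection to an unresponsive peer survives (an assumption, not a
 * guarantee made by the system). */
unsigned long keepalive_detect_seconds(unsigned int idle, unsigned int intvl,
    unsigned int cnt)
{
    return (unsigned long)idle + (unsigned long)intvl * cnt;
}
```

For example, tune_keepalive(fd, 60, 10, 5) would bring the estimate down from 15600 seconds to 110 seconds.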
TCP_KEEPINIT If a TCP connection cannot be established within some amount of time, TCP
will time out the connect attempt. The default value for this initial con-
nection establishment timeout is 150 seconds. The TCP_KEEPINIT option can be
used to affect this initial timeout period for a given socket, and specifies
the number of seconds to wait before the connect attempt is timed out. For
passive connections, the TCP_KEEPINIT option value is inherited from the
listening socket. This option takes an unsigned int value, with a value greater
than 0.
The option level for the setsockopt(2) call is the protocol number for TCP, available from
In the historical BSD TCP implementation, if the TCP_NODELAY option was set on a passive
socket, the sockets returned by accept(2) erroneously did not have the TCP_NODELAY option
set; the behavior was corrected to inherit TCP_NODELAY in NetBSD 1.6.
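
Whether a given system inherits TCP_NODELAY across accept(2) can be checked with a sketch like the one below. The helper name accepted_inherits_nodelay() is hypothetical, and the loopback address and system-chosen port are assumptions made to keep the check self-contained.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: set TCP_NODELAY on a passive socket, accept one
 * loopback connection, and test the option on the accepted socket.
 * Returns 1 if it was inherited, 0 if not, -1 on error. */
int accepted_inherits_nodelay(void)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int on = 1, inherited = 0, result = -1;
    int ls, c = -1, a = -1;

    ls = socket(AF_INET, SOCK_STREAM, 0);
    if (ls == -1)
        return -1;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(0);                    /* system assigns the port */
    if (setsockopt(ls, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == -1 ||
        bind(ls, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
        listen(ls, 1) == -1 ||
        getsockname(ls, (struct sockaddr *)&sin, &len) == -1)
        goto out;
    c = socket(AF_INET, SOCK_STREAM, 0);
    if (c == -1 || connect(c, (struct sockaddr *)&sin, len) == -1)
        goto out;
    a = accept(ls, NULL, NULL);
    if (a == -1)
        goto out;
    len = sizeof(inherited);
    if (getsockopt(a, IPPROTO_TCP, TCP_NODELAY, &inherited, &len) == -1)
        goto out;
    result = inherited != 0;
out:
    if (a != -1)
        close(a);
    if (c != -1)
        close(c);
    close(ls);
    return result;
}
```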
Options at the IP network level may be used with TCP; see ip(4) or ip6(4). Incoming
connection requests that are source-routed are noted, and the reverse source route is used in
responding.
There are many adjustable parameters that control various aspects of the NetBSD TCP behav-
ior; these parameters are documented in sysctl(7), and they include:
o RFC 1323 extensions for high performance
o Send/receive buffer sizes
o Default maximum segment size (MSS)
o SYN cache parameters
o Initial window size
o Hughes/Touch/Heidemann Congestion Window Monitoring algorithm
o Keepalive parameters
o NewReno algorithm for congestion control
o Logging of connection refusals
o RST packet rate limits
o SACK (Selective Acknowledgment)
o ECN (Explicit Congestion Notification)
o Congestion window increase methods: the traditional packet counting or RFC 3465
Appropriate Byte Counting
A socket operation may fail with one of the following errors returned:
[EISCONN] when trying to establish a connection on a socket which already has one;
[ENOBUFS] when the system runs out of memory for an internal data structure;
[ETIMEDOUT] when a connection was dropped due to excessive retransmissions;
[ECONNRESET] when the remote peer forces the connection to be closed;
[ECONNREFUSED] when the remote peer actively refuses connection establishment (usually
because no process is listening to the port);
[EADDRINUSE] when an attempt is made to create a socket with a port which has already
been allocated;
[EADDRNOTAVAIL] when an attempt is made to create a socket with a network address for which
no network interface exists.
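
The errors listed above are reported through errno when a socket call returns -1. The following sketch shows one way a program might map them to short messages and provoke EADDRINUSE deliberately. The helper names tcp_error_hint() and bind_loopback() are hypothetical, and the message strings are paraphrases written for this example, not text produced by the system.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: short description for the errors listed above;
 * any other value falls back to strerror(3). */
const char *tcp_error_hint(int err)
{
    switch (err) {
    case EISCONN:       return "socket already has a connection";
    case ENOBUFS:       return "system out of memory for internal data";
    case ETIMEDOUT:     return "dropped after excessive retransmissions";
    case ECONNRESET:    return "remote peer forced the connection closed";
    case ECONNREFUSED:  return "remote peer refused the connection";
    case EADDRINUSE:    return "port is already in use";
    case EADDRNOTAVAIL: return "no network interface has that address";
    default:            return strerror(err);
    }
}

/* Hypothetical helper: bind a new TCP socket to the loopback address and
 * the given port (host byte order, 0 for a system-assigned port).
 * Returns the descriptor, or -1 with errno describing the failure. */
int bind_loopback(in_port_t port)
{
    struct sockaddr_in sin;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s == -1)
        return -1;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
        int saved = errno;      /* close(2) may overwrite errno */
        close(s);
        errno = saved;
        return -1;
    }
    return s;
}
```

Binding a second socket to a port that the first still holds is expected to fail, and tcp_error_hint(errno) then yields the EADDRINUSE message.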
getsockopt(2), socket(2), inet(4), inet6(4), intro(4), ip(4), ip6(4), sysctl(7)
Transmission Control Protocol, RFC 793, September 1981.
Requirements for Internet Hosts -- Communication Layers, RFC 1122, October 1989.
The tcp protocol stack appeared in 4.2BSD.
BSD June 19, 2007 BSD