.\" Copyright (c) 2009, Sun Microsystems, Inc. All Rights Reserved.
.\" The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License. You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing.
.\" See the License for the specific language governing permissions and limitations under the License. When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with
.\" the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
.TH SCTP 7P "Jul 30, 2009"
.SH NAME
sctp, SCTP \- Stream Control Transmission Protocol
.SH SYNOPSIS
.LP
.nf
\fB#include <sys/socket.h>\fR
\fB#include <netinet/in.h>\fR

\fBs = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);\fR
\fBs = socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP);\fR
\fBs = socket(AF_INET6, SOCK_STREAM, IPPROTO_SCTP);\fR
\fBs = socket(AF_INET6, SOCK_SEQPACKET, IPPROTO_SCTP);\fR
.fi
.SH DESCRIPTION
.sp
.LP
SCTP is a transport protocol layered above the Internet Protocol (IP), or the
Internet Protocol Version 6 (IPv6). SCTP provides a reliable, session-oriented,
flow-controlled, two-way transmission of data. It is a message-oriented
protocol and supports framing of individual message boundaries. An SCTP
association is created between two endpoints for data transfer and is
maintained for the lifetime of the transfer. An SCTP association is set up
between two endpoints using a four-way handshake mechanism with the use of a
cookie to guard against some types of denial of service (DoS) attacks. These
endpoints may be represented by multiple IP addresses.
.sp
.LP
An SCTP packet includes a common SCTP header followed by one or more chunks.
Included in the common header is a 32-bit field which contains the checksum
(computed using the CRC-32c polynomial) of the entire SCTP packet.
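.sp
.LP
The checksum algorithm itself can be illustrated with a minimal sketch. The
function name \fBcrc32c()\fR is an assumption made for this example and is not
part of any system interface. It computes the CRC-32c value of a byte buffer
one bit at a time. How an implementation places the result into the common
header is outside the scope of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32c (Castagnoli). 0x82F63B78 is the bit-reflected form of
 * the polynomial 0x1EDC6F41. This version is slow but easy to audit;
 * production code normally uses lookup tables or hardware instructions.
 * crc32c() is a hypothetical name chosen for this sketch.
 */
uint32_t
crc32c(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;     /* initial value */
    size_t i;
    int bit;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        crc ^= buf[i];
        for (bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0x82F63B78u;
            else
                crc >>= 1;
        }
    }
    return (crc ^ 0xFFFFFFFFu);     /* final complement */
}
```

The standard check value for CRC-32c over the nine ASCII bytes "123456789" is
0xE3069283, which is a convenient self-test for any implementation.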
.sp
.LP
SCTP transfers data payloads in the form of DATA chunks. Each DATA chunk
contains a Transmission Sequence Number (TSN), which governs the transmission
of messages and detection of loss. DATA chunk exchanges follow the Transmission
Control Protocol's (TCP) Selective ACK (SACK) mechanism. The receiver
acknowledges data by sending SACK chunks, which indicate not only the
cumulative TSN range received but also any non-cumulative TSNs received,
implying gaps in the received TSN sequence. SACKs are sent using a delayed
acknowledgment method similar to TCP's, that is, one SACK for every other
received packet, with an upper bound on the delay. When gaps are detected, the
frequency is increased to one SACK for every received packet. Flow and
congestion control follow TCP algorithms: Slow Start, Congestion Avoidance,
Fast Recovery, and Fast Retransmit. Unlike TCP, however, SCTP supports neither
half-closed connections nor "urgent" data.
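.sp
.LP
The relationship between the cumulative TSN and the reported gaps can be
sketched as follows. The names \fBstruct gap_block\fR and \fBbuild_sack()\fR
are assumptions made for this example. The sketch ignores TSN wraparound and
keeps absolute TSNs for readability, whereas a SACK chunk on the wire carries
gap blocks as offsets from the cumulative TSN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One gap block: a run of TSNs received beyond the cumulative point. */
struct gap_block {
    uint32_t start;     /* first TSN of the run */
    uint32_t end;       /* last TSN of the run */
};

/*
 * Hypothetical helper. Given the highest in-sequence TSN seen so far
 * (*cum_tsn) and a sorted, duplicate-free array of received TSNs,
 * advance the cumulative TSN and record the remaining runs as gap
 * blocks. Returns the number of blocks written (at most max_blocks).
 * TSN wraparound is deliberately not handled.
 */
size_t
build_sack(uint32_t *cum_tsn, const uint32_t *tsns, size_t n,
    struct gap_block *blocks, size_t max_blocks)
{
    size_t i, nblocks = 0;

    for (i = 0; i < n; i++) {
        if (tsns[i] <= *cum_tsn)
            continue;               /* already acknowledged */
        if (nblocks == 0 && tsns[i] == *cum_tsn + 1) {
            (*cum_tsn)++;           /* extends the in-order range */
        } else if (nblocks > 0 &&
            tsns[i] == blocks[nblocks - 1].end + 1) {
            blocks[nblocks - 1].end = tsns[i];  /* extends current run */
        } else {
            if (nblocks == max_blocks)
                break;              /* no room to report more */
            blocks[nblocks].start = tsns[i];
            blocks[nblocks].end = tsns[i];
            nblocks++;
        }
    }
    return (nblocks);
}
```

With a cumulative TSN of 2 and received TSNs 3, 5, 6 and 8, the cumulative TSN
advances to 3 and two gap blocks are reported (5 to 6, and 8), telling the
sender that TSNs 4 and 7 are missing.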
.sp
.LP
SCTP is designed to support a number of functions that are critical for
telephony signalling transport, including multi-streaming. SCTP allows data to
be partitioned into multiple streams that have the property of independent
sequenced delivery so that message loss in any one stream only affects delivery
within that stream. In many applications (particularly telephony signalling),
it is only necessary to maintain sequencing of messages that affect some
resource. Other messages may be delivered without having to maintain overall
sequence integrity. A DATA chunk on an SCTP association contains the Stream
Id/Stream Sequence Number pair, in addition to the TSN, which is used for
sequenced delivery within a stream.
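.sp
.LP
Independent sequenced delivery can be modelled with one expected sequence
number per stream. The following is a simplified sketch, not the kernel's
algorithm: \fBNSTREAMS\fR, \fBstruct stream_state\fR and
\fBstream_try_deliver()\fR are hypothetical names, and a real receiver would
also queue held chunks and release them once the missing message arrives.

```c
#include <assert.h>
#include <stdint.h>

#define NSTREAMS 4      /* hypothetical stream count for this sketch */

/* Next Stream Sequence Number expected on each inbound stream. */
struct stream_state {
    uint16_t next_ssn[NSTREAMS];
};

/*
 * Returns 1 if the message (sid, ssn) can be delivered now, or 0 if it
 * must be held because an earlier message on the same stream is still
 * missing. Other streams are not consulted at all.
 */
int
stream_try_deliver(struct stream_state *st, uint16_t sid, uint16_t ssn)
{
    assert(sid < NSTREAMS);
    if (ssn != st->next_ssn[sid])
        return (0);
    st->next_ssn[sid]++;    /* 16-bit counter wraps naturally */
    return (1);
}
```

If message 1 on stream 0 is lost, message 2 on stream 0 is held, but messages
on stream 1 continue to be delivered in order.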
.sp
.LP
SCTP uses IP's host level addressing and adds its own per-host collection of
port addresses. The endpoints of an SCTP association are identified by the
combination of IP address(es) and an SCTP port number. By providing the ability
for an endpoint to have multiple IP addresses, SCTP supports multi-homing,
which makes an SCTP association more resilient in the presence of network
failures (assuming the network is constructed to provide redundancy). For a
multi-homed SCTP association, a single address is used as the primary address,
which is used as the destination address for normal DATA chunk transfers.
Retransmitted DATA chunks are sent over alternate address(es) to increase the
probability of reaching the remote endpoint. Continued failure to send DATA
chunks over the primary address results in selecting an alternate address as
the primary address. Additionally, SCTP monitors the accessibility of all
alternate addresses by sending periodic "heartbeat" chunks. An SCTP
association supports multi-homing by exchanging the available list of addresses
during association setup (as part of its four-way handshake mechanism). An SCTP
endpoint is associated with a local address using the \fBbind\fR(3SOCKET) call.
Subsequently, the endpoint can be associated with additional addresses using
\fBsctp_bindx\fR(3SOCKET). By using a special value of \fBINADDR_ANY\fR with IP
or the unspecified address (all zeros) with IPv6 in the \fBbind()\fR or
\fBsctp_bindx()\fR calls, an endpoint can be bound to all available IP or IPv6
addresses on the system.
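.sp
.LP
The wildcard addresses mentioned above can be prepared as in the following
sketch. The helper names \fBwildcard_v4()\fR and \fBwildcard_v6()\fR are
assumptions made for this example. Only standard socket address structures
are used.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: IPv4 wildcard address (INADDR_ANY) and port. */
void
wildcard_v4(struct sockaddr_in *sin, in_port_t port)
{
    (void) memset(sin, 0, sizeof (*sin));
    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);
    sin->sin_addr.s_addr = htonl(INADDR_ANY);
}

/* Hypothetical helper: IPv6 unspecified address (all zeros) and port. */
void
wildcard_v6(struct sockaddr_in6 *sin6, in_port_t port)
{
    (void) memset(sin6, 0, sizeof (*sin6));
    sin6->sin6_family = AF_INET6;
    sin6->sin6_port = htons(port);
    sin6->sin6_addr = in6addr_any;
}
```

The filled structure is then passed to \fBbind()\fR, for example
bind(s, (struct sockaddr *)&sin, sizeof (sin)), to bind the endpoint to all
available addresses of that family.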
.sp
.LP
SCTP uses a three-way mechanism to allow graceful shutdown, where each endpoint
has confirmation of the DATA chunks received by the remote endpoint prior to
completion of the shutdown. An Abort is provided for error cases when an
immediate shutdown is needed.
.sp
.LP
Applications can access SCTP using the socket interface as a \fBSOCK_STREAM\fR
(one-to-one style) or \fBSOCK_SEQPACKET\fR (one-to-many style) socket type.
.sp
.LP
The one-to-one style socket interface supports semantics similar to those of
sockets for connection-oriented protocols, such as TCP. Thus, a passive socket
is created by calling the \fBlisten\fR(3SOCKET) function after binding the
socket using \fBbind()\fR. Associations to this passive socket can be received
using the \fBaccept\fR(3SOCKET) function. Active sockets use the
\fBconnect\fR(3SOCKET) function after binding to initiate an association. If an
active socket is not explicitly bound, an implicit binding is performed. If an
application wants to exchange data during the association setup phase, it
should not call \fBconnect()\fR, but use
\fBsendto\fR(3SOCKET)/\fBsendmsg\fR(3SOCKET) to implicitly initiate an
association. Once an association has been established, \fBread\fR(2) and
\fBwrite\fR(2) can be used to exchange data. Additionally,
\fBsend\fR(3SOCKET), \fBrecv\fR(3SOCKET), \fBsendto()\fR,
\fBrecvfrom\fR(3SOCKET), \fBsendmsg()\fR, and \fBrecvmsg\fR(3SOCKET) can be
used.
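.sp
.LP
The passive side of the one-to-one style can be sketched as follows. The
function name \fBsctp_listen_one_to_one()\fR is an assumption made for this
example. The sketch binds to all IPv4 addresses and returns a descriptor
suitable for \fBaccept()\fR.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: create a passive one-to-one SCTP socket bound to
 * all IPv4 addresses. Returns the descriptor, or -1 with errno set (for
 * instance when the system provides no SCTP support).
 */
int
sctp_listen_one_to_one(in_port_t port, int backlog)
{
    struct sockaddr_in sin;
    int s, saved;

    s = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (s == -1)
        return (-1);

    (void) memset(&sin, 0, sizeof (sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    sin.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(s, (struct sockaddr *)&sin, sizeof (sin)) == -1 ||
        listen(s, backlog) == -1) {
        saved = errno;          /* close() may overwrite errno */
        (void) close(s);
        errno = saved;
        return (-1);
    }
    return (s);                 /* ready for accept() */
}
```

A caller would loop on accept(s, NULL, NULL) and exchange data on each returned
descriptor with \fBread()\fR and \fBwrite()\fR, exactly as with a TCP server.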
.sp
.LP
The one-to-many socket interface supports semantics similar to those of
sockets for connectionless protocols, such as UDP (however, unlike UDP, it does
not support broadcast or multicast communications). A passive socket is created
using the \fBlisten()\fR function after binding the socket using \fBbind()\fR.
An \fBaccept()\fR call is not needed to receive associations to this passive
socket (in fact, an \fBaccept()\fR on a one-to-many socket will fail).
Associations are accepted automatically and notifications of new associations
are delivered in \fBrecvmsg()\fR provided notifications are enabled. Active
sockets, after binding (implicitly or explicitly), need not call
\fBconnect()\fR to establish an association; implicit associations can be
created using \fBsendmsg()\fR/\fBrecvmsg()\fR or \fBsendto()\fR/\fBrecvfrom()\fR
calls. Such
implicit associations cannot be created using \fBsend()\fR and \fBrecv()\fR
calls. On an SCTP socket (one-to-one or one-to-many), an association may be
established using \fBsendmsg()\fR. However, if an association already exists
for the destination address specified in the \fImsg_name\fR member of the
\fImsg\fR parameter, \fBsendmsg()\fR must include the association id in the
\fImsg_control\fR member of the \fImsg\fR parameter (using an
\fBsctp_sndrcvinfo\fR structure) for a one-to-many SCTP socket. If the
association id is not provided, \fBsendmsg()\fR fails with \fBEADDRINUSE\fR. On
a one-to-one socket, the destination information in the \fImsg\fR parameter is
ignored for an established association.
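.sp
.LP
Implicit association setup with \fBsendmsg()\fR can be sketched as follows.
The function name \fBsctp_send_implicit()\fR is an assumption made for this
example, and \fIs\fR is expected to be a socket created with
\fBSOCK_SEQPACKET\fR. The sketch supplies only the destination address and no
ancillary data, so it covers only the case where no association with the peer
exists yet.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: send one message to 'peer' without calling
 * connect(). If no association with the peer exists, sendmsg() sets one
 * up implicitly. Returns the number of bytes sent, or -1 with errno set.
 */
ssize_t
sctp_send_implicit(int s, const struct sockaddr_in *peer,
    const void *buf, size_t len)
{
    struct iovec iov;
    struct msghdr mh;

    iov.iov_base = (void *)buf;     /* sendmsg() does not modify the data */
    iov.iov_len = len;

    (void) memset(&mh, 0, sizeof (mh));
    mh.msg_name = (void *)peer;     /* destination address */
    mh.msg_namelen = sizeof (*peer);
    mh.msg_iov = &iov;              /* message payload */
    mh.msg_iovlen = 1;

    return (sendmsg(s, &mh, 0));
}
```

To address an existing association on a one-to-many socket, a caller would
additionally attach an \fBsctp_sndrcvinfo\fR structure carrying the
association id, which this sketch omits.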
.sp
.LP
A one-to-one style association can be created from a one-to-many association by
branching it off using the \fBsctp_peeloff\fR(3SOCKET) call; \fBsend()\fR and
\fBrecv()\fR can be used on such peeled off associations. Calling
\fBclose\fR(2) on a one-to-many socket will gracefully shut down all the
associations represented by that one-to-many socket.
.sp
.LP
The \fBsctp_sendmsg\fR(3SOCKET) and \fBsctp_recvmsg\fR(3SOCKET) functions can
be used to access advanced features provided by SCTP.
.sp
.LP
SCTP provides the following socket options, which are set using
\fBsetsockopt\fR(3SOCKET) and read using \fBgetsockopt\fR(3SOCKET). The option
level is the protocol number for SCTP, available from
\fBgetprotobyname\fR(3SOCKET).
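.sp
.LP
Obtaining the option level can be sketched as follows. The function name
\fBsctp_option_level()\fR is an assumption made for this example. Falling
back to the \fBIPPROTO_SCTP\fR constant when the lookup fails is a choice made
by the sketch, not something this page prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <netdb.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: return the level argument to use with
 * setsockopt()/getsockopt() for SCTP options. The protocol number is
 * looked up in the protocols database; IPPROTO_SCTP is used if the
 * lookup fails. getprotobyname() returns static data, so this helper is
 * not safe to call from several threads at once.
 */
int
sctp_option_level(void)
{
    struct protoent *pe = getprotobyname("sctp");

    return (pe != NULL ? pe->p_proto : IPPROTO_SCTP);
}
```

The returned value is passed as the second argument of \fBsetsockopt()\fR or
\fBgetsockopt()\fR when one of the SCTP options is used.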
.TP
\fBSCTP_NODELAY\fR
Turn on/off any Nagle-like algorithm (similar to \fBTCP_NODELAY\fR).
.TP
\fBSO_RCVBUF\fR
Set the receive buffer.
.TP
\fBSO_SNDBUF\fR
Set the send buffer.
.TP
\fBSCTP_AUTOCLOSE\fR
For one-to-many style socket, automatically close any association that has been
idle for more than the specified number of seconds. A value of '0' indicates
that no associations should be closed automatically.
.TP
\fBSCTP_EVENTS\fR
Specify various notifications and ancillary data the user wants to receive.
.TP
\fBSCTP_STATUS\fR
Retrieve current status information about an SCTP association.
.TP
\fBSCTP_GET_ASSOC_STATS\fR
Gather and reset per endpoint association statistics.
.sp
.LP
.nf
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/sctp.h>

struct sctp_assoc_stats stat;
int rc;
socklen_t len = sizeof (stat);

/* Per endpoint stats use the socket descriptor (sd) of the association. */

/* Gather per endpoint association statistics */
rc = getsockopt(sd, IPPROTO_SCTP, SCTP_GET_ASSOC_STATS, &stat, &len);
.fi
.sp
.LP
Extract from the modified header file:
.sp
.nf
/*
 * SCTP socket option used to read per endpoint association statistics.
 */
#define	SCTP_GET_ASSOC_STATS	24

/*
 * A socket user request reads local per endpoint association stats.
 * All stats are counts except sas_maxrto, which is the max value
 * since the last user request for stats on this endpoint.
 */
typedef struct sctp_assoc_stats {
	uint64_t sas_rtxchunks;   /* Retransmitted Chunks */
	uint64_t sas_gapcnt;      /* Gap Acknowledgements Received */
	uint64_t sas_maxrto;      /* Maximum Observed RTO this period */
	uint64_t sas_outseqtsns;  /* TSN received > next expected */
	uint64_t sas_osacks;      /* SACKs sent */
	uint64_t sas_isacks;      /* SACKs received */
	uint64_t sas_octrlchunks; /* Control chunks sent - no dups */
	uint64_t sas_ictrlchunks; /* Control chunks received - no dups */
	uint64_t sas_oodchunks;   /* Ordered data chunks sent */
	uint64_t sas_iodchunks;   /* Ordered data chunks received */
	uint64_t sas_ouodchunks;  /* Unordered data chunks sent */
	uint64_t sas_iuodchunks;  /* Unordered data chunks received */
	uint64_t sas_idupchunks;  /* Dups received (ordered+unordered) */
} sctp_assoc_stats_t;
.fi
.sp
.LP
The ability of SCTP to use multiple addresses in an association can create
issues with some network utilities. This requires a system administrator to be
careful in setting up the system.
.sp
.LP
For example, \fBtcpd\fR allows an administrator to use a simple form of
address/hostname access control. While \fBtcpd\fR can work with SCTP, the
access control part can have some problems. The \fBtcpd\fR access control is
based on only one of the addresses at association setup time. Once an
association is allowed, no more checking is performed. This means that during
the lifetime of the association, SCTP packets from different addresses of the
peer host can be received in the system. This may not be what the system
administrator wants, as some of the peer's addresses are supposed to be blocked.
.sp
.LP
Another example is the use of IP Filter, which provides several functions such
as IP packet filtering (\fBipf\fR(1M)) and NAT (\fBipnat\fR(1M)). For packet
filtering, one issue is that a filter policy can block packets from some of the
addresses of an association while allowing packets from other addresses to go
through. This can degrade SCTP's performance when a failure occurs. There is a
more serious issue with IP address rewriting by NAT. At association setup time,
SCTP endpoints exchange IP addresses, but IP Filter is not aware of this. So
when NAT is done on a packet, it may change the address to an unacceptable one.
Thus the SCTP association setup may succeed, but packets cannot go through
afterwards when a different IP address is used for the association.
.SH SEE ALSO
.sp
.LP
\fBipf\fR(1M), \fBipnat\fR(1M), \fBndd\fR(1M), \fBioctl\fR(2), \fBclose\fR(2),
\fBread\fR(2), \fBwrite\fR(2), \fBaccept\fR(3SOCKET), \fBbind\fR(3SOCKET),
\fBconnect\fR(3SOCKET), \fBgetprotobyname\fR(3SOCKET),
\fBgetsockopt\fR(3SOCKET), \fBlibsctp\fR(3LIB), \fBlisten\fR(3SOCKET),
\fBrecv\fR(3SOCKET), \fBrecvfrom\fR(3SOCKET), \fBrecvmsg\fR(3SOCKET),
\fBsctp_bindx\fR(3SOCKET), \fBsctp_getladdrs\fR(3SOCKET),
\fBsctp_getpaddrs\fR(3SOCKET), \fBsctp_freepaddrs\fR(3SOCKET),
\fBsctp_opt_info\fR(3SOCKET), \fBsctp_peeloff\fR(3SOCKET),
\fBsctp_recvmsg\fR(3SOCKET), \fBsctp_sendmsg\fR(3SOCKET), \fBsend\fR(3SOCKET),
\fBsendmsg\fR(3SOCKET), \fBsendto\fR(3SOCKET), \fBsocket\fR(3SOCKET),
\fBipfilter\fR(5), \fBtcp\fR(7P), \fBudp\fR(7P), \fBinet\fR(7P),
\fBinet6\fR(7P), \fBip\fR(7P), \fBip6\fR(7P)
.sp
.LP
R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I.
Rytina, M. Kalla, L. Zhang, V. Paxson, \fIRFC 2960, Stream Control Transmission
Protocol\fR, October 2000.
.sp
.LP
L. Ong, J. Yoakum, \fIRFC 3286, An Introduction to Stream Control Transmission
Protocol (SCTP)\fR, May 2002.
.sp
.LP
J. Stone, R. Stewart, D. Otis, \fIRFC 3309, Stream Control Transmission
Protocol (SCTP) Checksum Change\fR, September 2002.
.SH DIAGNOSTICS
.sp
.LP
A socket operation may fail if:
.TP
\fBEPROTONOSUPPORT\fR
The socket type is neither \fBSOCK_STREAM\fR nor \fBSOCK_SEQPACKET\fR.
.TP
\fBETIMEDOUT\fR
An association was dropped due to excessive retransmissions.
.TP
\fBECONNREFUSED\fR
The remote peer refused to establish an association.
.TP
\fBEADDRINUSE\fR
A \fBbind()\fR operation was attempted on a socket with a network address/port
pair that has already been bound to another socket.
.sp
.LP
A \fBbind()\fR operation was attempted on a socket with an invalid network
address.
.sp
.LP
A \fBbind()\fR operation was attempted on a socket with a "reserved" port
number and the effective user ID of the process was not the privileged user.
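.sp
.LP
An application might translate the error codes named above into diagnostic
text as in the following sketch. The function name \fBsctp_errno_hint()\fR and
the message strings are assumptions made for this example. Codes not named
in this section fall through to \fBstrerror()\fR.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: map an errno value from an SCTP socket operation
 * to a short hint based on the conditions listed in this section.
 */
const char *
sctp_errno_hint(int err)
{
    switch (err) {
    case EPROTONOSUPPORT:
        return ("socket type must be SOCK_STREAM or SOCK_SEQPACKET");
    case ETIMEDOUT:
        return ("association dropped after excessive retransmissions");
    case ECONNREFUSED:
        return ("remote peer refused the association");
    case EADDRINUSE:
        return ("address/port pair already bound to another socket");
    default:
        return (strerror(err));     /* generic system message */
    }
}
```

A typical use is to print sctp_errno_hint(errno) immediately after a failed
\fBsocket()\fR, \fBbind()\fR or \fBconnect()\fR call, before any other call
can change \fBerrno\fR.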