docs/design.txt

   1 Nice: Design documentation
   2 ==========================
   3
   4 Socket ownership
   5 ----------------
   6
   7 For UDP candidates, one socket is created for each component and bound
   8 to INADDR_ANY. The same local socket is used for the host candidate,
   9 STUN candidate as well as the TURN candidate. The socket handles are
  10 stored in the Component structure.
  11
  12 The library will use the source address of incoming packets in order
  13 to identify from which remote candidates, if any (peer-derived
  14 candidates), packets were sent.
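
A minimal sketch of the source-address lookup described above, assuming
IPv4 only. The names struct remote_cand and find_remote_by_source() are
hypothetical and are not libnice types. A NULL result corresponds to a
packet from an unknown source, which is how a peer-derived candidate
would first be noticed.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical remote candidate record (illustration only). */
struct remote_cand {
    struct sockaddr_in addr;   /* transport address of the candidate */
    int id;                    /* application-defined identifier */
};

/* Return the candidate whose IP address and port equal the source address
 * of a received packet, or NULL when the source is not a known candidate. */
static const struct remote_cand *
find_remote_by_source(const struct remote_cand *cands, size_t n,
                      const struct sockaddr_in *src)
{
    assert(src != NULL);
    assert(cands != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (cands[i].addr.sin_addr.s_addr == src->sin_addr.s_addr &&
            cands[i].addr.sin_port == src->sin_port)
            return &cands[i];
    }
    return NULL;
}
```

The src argument would typically be the address filled in by recvfrom()
on the shared component socket.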
  15
  16 XXX: Describe the subtle issues with ICMP error handling when one
  17 socket is used to send to multiple destinations.
  18
  19 Real-time considerations
  20 ------------------------
  21
  22 One potential use for libnice code is providing network connectivity
  23 for media transport in voice and video telephony applications. This
  24 means that the libnice code is potentially run in real-time context
  25 (for instance under POSIX SCHED_FIFO/SCHED_RR scheduling policy) and
  26 ideally has deterministic execution time.
  27
  28 To be real-time friendly, operations with non-deterministic execution
  29 time (dynamic memory allocation, file and other resource access) should
  30 be done at startup/initialization phase. During an active session
  31 (connectivity has been established and non-STUN traffic is being sent),
  32 code should be as deterministic as possible.
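
One way to follow this guideline is a fixed-size buffer pool that is
filled at startup and only hands out buffers afterwards. The sketch
below is an illustration under assumed names (struct buf_pool,
buf_pool_*) and an assumed buffer size; it is not code from libnice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_BUF_SIZE 1500   /* assumed per-packet buffer size */

/* Hypothetical fixed-size buffer pool (illustration only). */
struct buf_pool {
    unsigned char *storage;  /* one contiguous block holding all buffers */
    void **free_list;        /* stack of currently free buffers */
    size_t free_count;
    size_t capacity;
};

/* Startup phase: the only place where malloc() is called.
 * Returns 0 on success, -1 on allocation failure. */
static int buf_pool_init(struct buf_pool *p, size_t capacity)
{
    assert(p != NULL && capacity > 0);
    p->storage = malloc(capacity * POOL_BUF_SIZE);
    p->free_list = malloc(capacity * sizeof *p->free_list);
    if (p->storage == NULL || p->free_list == NULL) {
        free(p->storage);
        free(p->free_list);
        return -1;
    }
    for (size_t i = 0; i < capacity; i++)
        p->free_list[i] = p->storage + i * POOL_BUF_SIZE;
    p->free_count = capacity;
    p->capacity = capacity;
    return 0;
}

/* Active session: constant time, no allocation. NULL when exhausted. */
static void *buf_pool_get(struct buf_pool *p)
{
    return p->free_count > 0 ? p->free_list[--p->free_count] : NULL;
}

/* Active session: return a buffer to the pool in constant time. */
static void buf_pool_put(struct buf_pool *p, void *buf)
{
    assert(p->free_count < p->capacity);
    p->free_list[p->free_count++] = buf;
}

/* Shutdown phase: release everything allocated by buf_pool_init(). */
static void buf_pool_destroy(struct buf_pool *p)
{
    free(p->storage);
    free(p->free_list);
}
```

Pool exhaustion is reported to the caller instead of falling back to
malloc(), so the cost of buf_pool_get() stays bounded during a session.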
  33
  34 Memory management
  35 -----------------
  36
  37 To work on platforms where available memory may be constrained, libnice
  38 should gracefully handle out of memory situations. If memory allocation
  39 fails, the library should return an error via the originating public
  40 library API function.
  41
  42 Use of glib creates some challenges in meeting the above:
  43
  44 - A lot of glib's internal code assumes memory allocations will
  45   always work. Use of these glib facilities should be limited.
  46   While the glib default policy (see g_malloc() documentation) of terminating
  47   the process is ok for applications, this is not acceptable for library
  48   components.
  49 - Glib has weak support for preallocating structures needed at
  50   runtime (for instance use of timers creates a lot of memory
  51   allocation activity).
  52
  53 To work around the above limitations, the following guidelines need
  54 to be followed:
  55
  56 - Always check return values of glib functions.
  57 - Use safe variants: g_try_malloc(), etc.
  58 - Current issues (last update 2007-05-04)
  59      - g_slist_append() will crash if alloc fails
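
As an illustration of these guidelines, the sketch below shows a list
append that reports allocation failure to its caller instead of
aborting. It uses plain malloc() as a stand-in for g_try_malloc() so
that it builds without glib; struct node, list_try_append() and
list_free() are hypothetical names, not part of glib or libnice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical singly linked list node (illustration only). */
struct node {
    void *data;
    struct node *next;
};

/* Append data at the tail of *head. Returns false, leaving the list
 * unchanged, when the node cannot be allocated. */
static bool list_try_append(struct node **head, void *data)
{
    assert(head != NULL);
    struct node *n = malloc(sizeof *n);  /* g_try_malloc() in glib code */
    if (n == NULL)
        return false;
    n->data = data;
    n->next = NULL;
    while (*head != NULL)                /* walk to the tail link */
        head = &(*head)->next;
    *head = n;
    return true;
}

/* Free all nodes; the data pointers are owned by the caller. */
static void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

A public API function built on such a helper can turn a false result
into its own error return, as required above.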
  60
  61 Timers
  62 ------
  63
  64 Management of timers is handled by the 'agent' module. Other modules
  65 may use timer APIs to get timestamps, but they do not run timers.
  66
  67 Glib's timer interface has some problems that have affected the design:
  68
  69  - an expired timer will destroy the source (a potentially costly
  70    operation)
  71  - it is not possible to cancel a timer, or adjust its expiration
  72    time, without destroying the associated source and creating
  73    a new one, which again causes malloc/frees and is potentially
  74    a costly operation
  75  - on Linux, glib uses gettimeofday() which is subject to clock
  76    skew, and no monotonic timer API is available
  77
  78 Due to the above, 'agent' code runs fixed interval periodic timers
  79 (started with g_timeout_add()) during candidate gathering, connectivity
  80 check, and session keepalive phases. Timer frequency is set separately
  81 for each phase of processing. A more elegant design would use dynamic
  82 timeouts, but this would be too expensive with glib timer
  83 infrastructure.
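
The fixed-interval approach can be sketched as a single tick function
that scans pending work on every period. In glib-based code, such a
function would be called from the callback registered with
g_timeout_add(). The names struct pending and tick() are hypothetical
and only illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending item, e.g. a scheduled retransmission. */
struct pending {
    uint64_t deadline_ms;   /* absolute time at which the item is due */
    bool active;            /* false once the item has fired */
};

/* Body of the periodic tick: fire every active item whose deadline has
 * been reached and return how many fired. No timer source is created or
 * destroyed here, so the tick itself performs no memory allocation. */
static size_t tick(struct pending *items, size_t n, uint64_t now_ms)
{
    assert(items != NULL || n == 0);
    size_t fired = 0;
    for (size_t i = 0; i < n; i++) {
        if (items[i].active && items[i].deadline_ms <= now_ms) {
            items[i].active = false;  /* a real agent would act here */
            fired++;
        }
    }
    return fired;
}
```

The trade-off is precision: an item can fire up to one tick interval
after its deadline, which is why the interval is chosen per phase.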
  84
  85 Control flow for NICE agent API (NiceAgentClass)
  86 ------------------------------------------------
  87
  88 The main library interface for applications using libnice is the
  89 NiceAgent GObject interface defined in 'nice/agent.h'.
  90
  91 The rough order of control flow is as follows:
  92
  93 - client should initialize glib with g_type_init()
  94 - creation of NiceAgent object instance (a UDP socket factory object
  95   instance must be given as a parameter)
  96 - setting agent properties such as STUN and TURN server addresses, and
  97   selection of ICE operating mode
  98 - connecting the GObject signals with g_signal_connect() to application
  99   callback functions
 100 - adding local interface addresses to use with
 101   nice_agent_add_local_address()
 102 - attach the mainloop context to connect the NiceAgent state machine to
 103   the application's event loop (using nice_agent_main_context_attach())
 104
 105 The flow then continues as follows when making an initial offer:
 106
 107 - creating the streams with nice_agent_add_stream()
 108 - the application should wait for the "candidate-gathering-done" signal
 109   before going forward (so that ICE can gather the needed set of local
 110   connectivity candidates)
 111 - get the information needed for sending offer using
 112   nice_agent_get_local_candidates() and
 113   nice_agent_get_local_credentials()
 114 - client should now send the session offer
 115 - once it receives an answer, it can pass the information to NiceAgent
 116   using nice_agent_set_remote_candidates() and
 117   nice_agent_set_remote_credentials()
 118
 119 Alternatively, when answering an initial offer:
 120
 121 - the first three steps are the same as above (making initial offer)
 122 - pass the remote session information to NiceAgent using
 123   nice_agent_set_remote_candidates() and
 124   nice_agent_set_remote_credentials()
 125 - client can send the answer to session offer
 126
 127 Special considerations for a SIP client:
 128
 129 - Upon sending the initial offer/answer, client should pick one
 130   local candidate as the default one, and encode it to the SDP
 131   "m" and "c" lines, in addition to the ICE "a=candidate" lines.
 132 - Client should connect to "new-selected-pair" signals. If this
 133   signal is received, a new candidate pair has been set as
 134   a selected pair (highest priority nominated pair). See
 135   ICE specification for a definition of "nominated pairs".
 136 - Once all components of a stream have reached the
 137   "NICE_COMPONENT_STATE_READY" state (as reported by
 138   "component-state-changed" signals), the client should check
 139   whether its original default candidate matches the latest
 140   selected pair. If not, it needs to send an updated offer if
 141   it is in controlling mode. Before sending the offer, client
 142   should check the "controlling-mode" property to check that
 143   it still is in controlling mode (might change during ICE
 144   processing due to ICE role conflicts).
 145 - The "remote-candidates" SDP attribute can be created from
 146   the information provided by "component-state-changed" (which
 147   components are ready), "new-selected-pair" (which candidates
 148   are selected) and "new-remote-candidate" (peer-reflexive
 149   candidates discovered during processing) signals.
 150 - Forked calls are not yet supported by the API (multiple
 151   sets of remote candidates for one local set of candidates).
 152
 153 Restarting ICE:
 154
 155 - ICE processing can be restarted by calling nice_agent_restart()
 156 - Restart will clean the set of remote candidates, so client must
 157   afterwards call nice_agent_set_remote_candidates() after receiving
 158   a new offer/answer for the restarted ICE session.
 159 - Restart will reinitialize the local credentials (see
 160   nice_agent_get_local_credentials()).
 161 - To use the "dribble" mode, client first has to initialize the stream by
 162   calling nice_agent_set_remote_candidates() with an empty set of
 163   candidates, and then start adding new remote candidates with
 164   nice_agent_add_remote_candidate().
 165      - XXX: initial plan, needs review...
 166 - Note that to modify the set of local candidates, a new stream
 167   has to be created. For the remote party, this looks like an ICE
 168   restart as well.
 169
 170 Handling fallback to non-ICE operation:
 171
 172 - If we are the offering party, and the remote party indicates
 173   it doesn't support ICE, we can use nice_agent_set_selected_pair()
 174   to force selection of a candidate pair (for remote party,
 175   the information on SDP 'm=' and 'c=' lines needs to be used
 176   to generate one remote candidate for each component of the
 177   streams). This function will halt all ICE processing (excluding
 178   keepalives), while still allowing media to be sent and received
 179   (assuming NATs won't interfere).
 180
 181 Notes about sending media:
 182
 183 - Client may send media once all components of a stream have reached
 184   state of NICE_COMPONENT_STATE_CONNECTED or NICE_COMPONENT_STATE_READY,
 185   (as reported by "component-state-changed" signals), and a selected pair
 186   is set for all components (as reported by "new-selected-pair" signals).
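
The condition above can be sketched as a predicate over per-component
status that the client updates from its signal handlers. The enum and
struct below are hypothetical mirrors kept by the application; they are
not the libnice definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical application-side copy of a component state. */
enum comp_state {
    COMP_DISCONNECTED,
    COMP_GATHERING,
    COMP_CONNECTING,
    COMP_CONNECTED,
    COMP_READY,
    COMP_FAILED
};

struct comp_status {
    enum comp_state state;   /* last "component-state-changed" value */
    bool has_selected_pair;  /* set when "new-selected-pair" arrives */
};

/* True when every component of the stream is CONNECTED or READY and
 * has a selected pair, i.e. when media may be sent. */
static bool stream_can_send_media(const struct comp_status *comps, size_t n)
{
    assert(comps != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        bool connected = comps[i].state == COMP_CONNECTED ||
                         comps[i].state == COMP_READY;
        if (!connected || !comps[i].has_selected_pair)
            return false;
    }
    return true;
}
```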
 187
 188 STUN API
 189 --------
 190
 191 The underlying STUN library takes care of:
 192 - formatting and parsing STUN messages (lower layer),
 193 - running STUN transactions for different STUN usages (higher layer).
 194
 195 Applications should only need to use the higher layer API which then
 196 uses the lower layer API.
 197
 198 The following STUN usages are currently implemented by the
 199 transaction layer:
 200 - Binding discovery (RFC3489bis with RFC3489 backward compatibility)
 201 - ICE connectivity checks
 202
 203 The following usages are planned but not implemented currently:
 204 - Relay (TURN)
 205 - Binding keep-alive
 206
 207 STUN transaction API
 208 --------------------
 209
 210 STUN transactions are supported through a set of non-blocking functions.
 211 The application is responsible for the blocking poll operation, so that
 212 it can run any number of STUN transactions and other work within the
 213 same thread:
 214 - Initialization and initiation of the transaction: stun_*_start()
 215 - I/O event polling: stun_*_fd() and stun_*_timeout() specify which
 216   file descriptor to wait on and for how long, respectively
 217 - Incoming data processing: stun_*_process()
 218 - Timeout processing: stun_*_elapse()
 219 - Cancellation (at any time) of the transaction: stun_*_cancel()
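
The division of labour listed above can be sketched with poll(). The
struct txn and the txn_*() functions below are hypothetical stand-ins
for a transaction and for the stun_*_fd(), stun_*_timeout(),
stun_*_process() and stun_*_elapse() calls; the real signatures are not
shown in this document and are not assumed here.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-in for one STUN transaction. */
struct txn {
    int fd;          /* descriptor to wait on (cf. stun_*_fd()) */
    int timeout_ms;  /* how long to wait (cf. stun_*_timeout()) */
    int replies;     /* incremented by txn_process() */
    int timeouts;    /* incremented by txn_elapse() */
};

/* Stand-in for stun_*_process(): consume incoming data. */
static void txn_process(struct txn *t)
{
    char buf[512];
    if (read(t->fd, buf, sizeof buf) > 0)
        t->replies++;
}

/* Stand-in for stun_*_elapse(): handle an expired timeout. */
static void txn_elapse(struct txn *t)
{
    t->timeouts++;
}

/* One iteration of the application's loop: block in poll(), then
 * dispatch. Returns 1 if data was processed, 0 on timeout and -1 on a
 * poll() error. */
static int txn_wait_once(struct txn *t)
{
    assert(t != NULL);
    struct pollfd pfd = { .fd = t->fd, .events = POLLIN };
    int rc = poll(&pfd, 1, t->timeout_ms);
    if (rc < 0)
        return -1;
    if (rc == 0) {
        txn_elapse(t);
        return 0;
    }
    txn_process(t);
    return 1;
}
```

Because the blocking call lives in the application, the same poll() set
can carry several transactions alongside unrelated descriptors.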
 220
 221 On the "server" side, STUN requires that request processing be
 222 idempotent, and there are no timeouts in the currently supported
 223 usages. As such, each usage is made of a single function that
 224 parses a request and formats an answer: stun_*_reply()
 225
 226 STUN message API
 227 ----------------
 228
 229 The STUN message API provides thin wrappers to parse and format STUN
 230 messages. To achieve maximum cross-architecture portability and retain
 231 real-time friendliness, these functions are fully "computational" [1].
 232 They also make no assumption about endianness or memory alignment
 233 (reading single bytes or using memcpy()).
 234
 235 Message buffers are provided by the caller (so these can be
 236 preallocated). Because STUN uses a relatively computer-friendly binary
 237 format, STUN messages are stored in wire format within the buffers.
 238 There is no intermediary translation, so the APIs can operate directly
 239 with data received from or sent to the network.
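
A minimal sketch of this style of access: the message type and body
length are read byte by byte from a caller-provided buffer that holds
the message in wire format. The function names are hypothetical; the
layout used (16-bit type, 16-bit length, 20-byte header, all in network
byte order) is the standard STUN header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STUN_HEADER_LEN 20   /* fixed STUN header size in bytes */

/* Read a 16-bit big-endian value one byte at a time: no alignment
 * requirement and no dependence on host byte order. */
static uint16_t read_u16_be(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical helper: extract the message type and body length in
 * place. Returns 0 on success, or -1 if the buffer is shorter than the
 * header or than the length the header announces. */
static int peek_stun_header(const uint8_t *buf, size_t len,
                            uint16_t *type, uint16_t *body_len)
{
    assert(buf != NULL && type != NULL && body_len != NULL);
    if (len < STUN_HEADER_LEN)
        return -1;
    *type = read_u16_be(buf);
    *body_len = read_u16_be(buf + 2);
    if ((size_t)*body_len + STUN_HEADER_LEN > len)
        return -1;
    return 0;
}
```

Nothing is copied or allocated, so the same buffer that was passed to
the receive call can be inspected directly.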
 240
 241 [1] With two exceptions: creating a new message might require locking
 242 to ensure uniqueness; and OpenSSL which is used for cryptographic
 243 hashing and random number generation might access the system entropy
 244 pool, use threading synchronization...