@tableofcontents

@page dataflow Data flow in the Tor process

We read bytes from the network, we write bytes to the network.  For the
most part, the bytes we write correspond roughly to bytes we have read,
with bits of cryptography added in.

The rest is a matter of details.

### Connections and buffers: reading, writing, and interpreting.

At a low level, Tor's networking code is based on "connections".  Each
connection represents an object that can send or receive network-like
events.  For the most part, each connection has a single underlying TCP
stream (I'll discuss counterexamples below).

A connection that behaves like a TCP stream has an input buffer and an
output buffer.  Incoming data is written into the input buffer
("inbuf"); data to be written to the network is queued on an output
buffer ("outbuf").

Buffers are implemented in buffers.c.  Each of these buffers is
implemented as a linked queue of memory extents, in the style of classic
BSD mbufs, or Linux skbufs.

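To make the "linked queue of memory extents" idea concrete, here is a minimal sketch.  It is not the code in buffers.c: the `sketch_buf_t` and `chunk_t` types, the function names, and the deliberately tiny chunk size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_CAPACITY 8  /* tiny on purpose, so that data spans chunks */

/* One memory extent in the queue (hypothetical layout). */
typedef struct chunk_t {
  struct chunk_t *next;
  size_t start;               /* offset of the first unread byte */
  size_t len;                 /* number of unread bytes */
  char data[CHUNK_CAPACITY];
} chunk_t;

/* A buffer is a singly linked queue of chunks. */
typedef struct sketch_buf_t {
  chunk_t *head;
  chunk_t *tail;
  size_t total;               /* unread bytes across all chunks */
} sketch_buf_t;

/* Append n bytes, adding a chunk whenever the tail is full.
 * Returns 0 on success, -1 if allocation fails. */
static int sketch_buf_add(sketch_buf_t *buf, const char *src, size_t n)
{
  assert(buf && src);
  while (n > 0) {
    chunk_t *t = buf->tail;
    size_t used = t ? t->start + t->len : CHUNK_CAPACITY;
    if (used == CHUNK_CAPACITY) {
      chunk_t *c = calloc(1, sizeof(*c));
      if (!c)
        return -1;
      if (t)
        t->next = c;
      else
        buf->head = c;
      buf->tail = t = c;
      used = 0;
    }
    size_t room = CHUNK_CAPACITY - used;
    size_t take = n < room ? n : room;
    memcpy(t->data + used, src, take);
    t->len += take;
    buf->total += take;
    src += take;
    n -= take;
  }
  return 0;
}

/* Move up to n bytes from the front of the queue into dst, freeing
 * chunks as they drain.  Returns the number of bytes copied. */
static size_t sketch_buf_get(sketch_buf_t *buf, char *dst, size_t n)
{
  size_t copied = 0;
  assert(buf && dst);
  while (n > 0 && buf->head) {
    chunk_t *h = buf->head;
    size_t take = n < h->len ? n : h->len;
    memcpy(dst + copied, h->data + h->start, take);
    h->start += take;
    h->len -= take;
    buf->total -= take;
    copied += take;
    n -= take;
    if (h->len == 0) {
      buf->head = h->next;
      if (!buf->head)
        buf->tail = NULL;
      free(h);
    }
  }
  return copied;
}
```

The point of the chunked layout is that appending never has to move bytes that are already queued, and draining from the front frees memory one extent at a time.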
A connection's reading and writing can be enabled or disabled.  Under
the hood, this functionality is implemented using libevent events: one
for reading, one for writing.  These events are turned on/off in
main.c, in the functions connection_{start,stop}_{reading,writing}.

When a read or write event is turned on, the main libevent loop polls
the kernel, asking which sockets are ready to read or write.  (This
polling happens in the event_base_loop() call in run_main_loop_once()
in main.c.)  When libevent finds a socket that's ready to read or write,
it invokes conn_{read,write}_callback(), also in main.c.

These callback functions delegate to connection_handle_read() and
connection_handle_write() in connection.c, which read or write on the
network as appropriate, possibly delegating to OpenSSL.

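The enable/poll/callback cycle described above can be sketched without libevent.  The example below swaps in plain POSIX `poll()` so that it stays self-contained; it is not how Tor's main loop is written.  The `sketch_conn_t` type and every `sketch_*` function are hypothetical stand-ins for the real connection object and callbacks.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define SKETCH_MAX_CONNS 16

/* Hypothetical connection: a descriptor, two enable flags, and an inbuf. */
typedef struct sketch_conn_t {
  int fd;
  int reading_enabled;
  int writing_enabled;
  char inbuf[64];
  size_t inbuf_len;
  int n_write_callbacks;
} sketch_conn_t;

static void sketch_conn_start_reading(sketch_conn_t *c) { c->reading_enabled = 1; }
static void sketch_conn_stop_reading(sketch_conn_t *c) { c->reading_enabled = 0; }

/* Stand-in for the read callback: pull available bytes into the inbuf. */
static void sketch_read_callback(sketch_conn_t *c)
{
  size_t room = sizeof(c->inbuf) - c->inbuf_len;
  ssize_t n = room ? read(c->fd, c->inbuf + c->inbuf_len, room) : 0;
  if (n > 0)
    c->inbuf_len += (size_t)n;
}

/* Ask the kernel which enabled connections are ready, then run their
 * callbacks.  Returns poll()'s result: the number of ready descriptors,
 * 0 if none were ready, or -1 on error. */
static int sketch_run_loop_once(sketch_conn_t **conns, size_t n, int timeout_ms)
{
  struct pollfd pfds[SKETCH_MAX_CONNS];
  size_t i;
  int ready;
  assert(n <= SKETCH_MAX_CONNS);
  for (i = 0; i < n; ++i) {
    pfds[i].fd = conns[i]->fd;
    pfds[i].events = 0;
    pfds[i].revents = 0;
    if (conns[i]->reading_enabled)
      pfds[i].events |= POLLIN;
    if (conns[i]->writing_enabled)
      pfds[i].events |= POLLOUT;
  }
  ready = poll(pfds, (nfds_t)n, timeout_ms);
  for (i = 0; ready > 0 && i < n; ++i) {
    if (pfds[i].revents & POLLIN)
      sketch_read_callback(conns[i]);
    if (pfds[i].revents & POLLOUT)
      ++conns[i]->n_write_callbacks;
  }
  return ready;
}
```

Note that a connection with reading disabled is simply never reported as readable, even when data is waiting on its socket.  The data stays in the kernel until reading is turned back on.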
After data is read or written, or another event occurs, these
connection_handle_{read,write}() functions call logic functions whose
job is to respond to the information.  Some examples include:

   * connection_flushed_some() -- called after a connection writes any
     amount of data from its outbuf.
   * connection_finished_flushing() -- called when a connection has
     emptied its outbuf.
   * connection_finished_connecting() -- called when an in-progress
     connection finishes making a remote connection.
   * connection_reached_eof() -- called after receiving a FIN from the
     remote server.
   * connection_process_inbuf() -- called when more data arrives on
     the inbuf.

These functions then call into specific implementations depending on
the type of the connection.  For example, if the connection is an
edge_connection_t, connection_reached_eof() will call
connection_edge_reached_eof().

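The per-type dispatch described above amounts to a switch on the connection's type.  A minimal sketch follows; the enum values, the struct, and the `sketch_*` handlers are hypothetical and only mirror the shape of the real functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of connection types. */
typedef enum {
  SKETCH_CONN_TYPE_OR,
  SKETCH_CONN_TYPE_EDGE,
  SKETCH_CONN_TYPE_DIR
} sketch_conn_type_t;

typedef struct sketch_connection_t {
  sketch_conn_type_t type;
  int n_or_eof;    /* counters stand in for real per-type handling */
  int n_edge_eof;
  int n_dir_eof;
} sketch_connection_t;

static int sketch_or_reached_eof(sketch_connection_t *c) { ++c->n_or_eof; return 0; }
static int sketch_edge_reached_eof(sketch_connection_t *c) { ++c->n_edge_eof; return 0; }
static int sketch_dir_reached_eof(sketch_connection_t *c) { ++c->n_dir_eof; return 0; }

/* Generic entry point: pick the type-specific handler.
 * Returns the handler's result, or -1 for an unknown type. */
static int sketch_connection_reached_eof(sketch_connection_t *conn)
{
  assert(conn != NULL);
  switch (conn->type) {
    case SKETCH_CONN_TYPE_OR:
      return sketch_or_reached_eof(conn);
    case SKETCH_CONN_TYPE_EDGE:
      return sketch_edge_reached_eof(conn);
    case SKETCH_CONN_TYPE_DIR:
      return sketch_dir_reached_eof(conn);
  }
  return -1;
}
```

The generic layer stays ignorant of what an EOF means for each kind of connection; it only knows whom to ask.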
> **Note:** "Also there are bufferevents!"  We have vestigial
> code for an alternative low-level networking
> implementation, based on Libevent's evbuffer and bufferevent
> code.  These two object types take on (most of) the roles of
> buffers and connections respectively. It isn't working in today's
> Tor, due to code rot and possible lingering libevent bugs.  More
> work is needed; it would be good to get this working efficiently
> again, to have IOCP support on Windows.

#### Controlling connections ####

A connection can have reading or writing enabled or disabled for a
wide variety of reasons, including:

   * Writing is disabled when there is no more data to write.
   * For some connection types, reading is disabled when the inbuf is
     too full.
   * Reading/writing is temporarily disabled on connections that have
     recently read/written enough data to reach their bandwidth limit.
   * Reading is disabled on connections when reading more data from them
     would require that data to be buffered somewhere else that is
     already full.

Currently, these conditions are checked in a diffuse set of
increasingly complex conditional expressions.  In the future, it could
be helpful to transition to a unified model for handling temporary
read/write suspensions.

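One possible shape for such a unified model is a set of "reasons reading is blocked", where reading is enabled exactly when the set is empty.  The sketch below is speculative and does not describe existing Tor code; the reason names and the `sketch_*` functions are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reasons for suspending reads, one bit each. */
#define SKETCH_BLOCK_INBUF_FULL      (1u << 0)
#define SKETCH_BLOCK_BANDWIDTH       (1u << 1)
#define SKETCH_BLOCK_DOWNSTREAM_FULL (1u << 2)

typedef struct sketch_rw_state_t {
  uint32_t read_blocked;   /* bitmask of active reasons */
  int reading_enabled;     /* derived: 1 iff read_blocked == 0 */
} sketch_rw_state_t;

/* Recompute the derived flag in exactly one place. */
static void sketch_update_reading(sketch_rw_state_t *s)
{
  s->reading_enabled = (s->read_blocked == 0);
}

static void sketch_block_reading(sketch_rw_state_t *s, uint32_t reason)
{
  assert(s && reason);
  s->read_blocked |= reason;
  sketch_update_reading(s);
}

static void sketch_unblock_reading(sketch_rw_state_t *s, uint32_t reason)
{
  assert(s && reason);
  s->read_blocked &= ~reason;
  sketch_update_reading(s);
}
```

With this layout, clearing one reason (say, the bandwidth limit refilling) cannot accidentally re-enable reading while another reason (a full inbuf) still applies.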
#### Kinds of connections ####

Today Tor has the following connection and pseudoconnection types.
For the most part, each type of connection has an associated C module
that implements its underlying logic.

**Edge connections** receive data from and deliver data to points
outside the onion routing network.  See `connection_edge.c`. They fall into two types:

**Entry connections** are a type of edge connection. They receive data
from the user running a Tor client, and deliver data to that user.
They are used to implement SOCKSPort, TransPort, NATDPort, and so on.
Sometimes they are called "AP" connections for historical reasons (it
used to stand for "Application Proxy").

**Exit connections** are a type of edge connection. They exist at an
exit node, and transmit traffic to and from the network.

(Entry connections and exit connections are also used as placeholders
when performing a remote DNS request; they are not decoupled from the
notion of "stream" in the Tor protocol. This is implemented partially
in `connection_edge.c`, and partially in `dnsserv.c` and `dns.c`.)

**OR connections** send and receive Tor cells over TLS, using some
version of the Tor link protocol.  They are implemented mainly in
`connection_or.c`, with a bit of logic in `command.c`,
`relay.c`, and `channeltls.c`.

**Extended OR connections** are a type of OR connection for use on
bridges using pluggable transports, so that the PT can tell the bridge
some information about the incoming connection before passing on its
data.  They are implemented in `ext_orport.c`.

**Directory connections** are server-side or client-side connections
that implement Tor's HTTP-based directory protocol.  These are
instantiated using a socket when Tor is making an unencrypted HTTP
connection.  When Tor is tunneling a directory request over a Tor
circuit, directory connections are implemented using a linked
connection pair (see below).  Directory connections are implemented in
`directory.c`; some of the server-side logic is implemented in
`dirserver.c`.

**Controller connections** are local connections to a controller
process implementing the controller protocol from
`control-spec.txt`. These are in `control.c`.

**Listener connections** are not stream oriented!  Rather, they wrap a
listening socket in order to detect new incoming connections.  They
bypass most of the stream logic.  They don't have associated buffers.
They are implemented in `connection.c`.

![structure hierarchy for connection types](./diagrams/02/02-connection-types.png "structure hierarchy for connection types")

> **Note:** "History Time!" You might occasionally find references to a
> couple of types of connections that no longer exist in modern Tor.  A
> *CPUWorker connection* connected the main Tor process to a thread or
> process used for computation.  (Nowadays we use in-process
> communication.)  Even more anciently, a *DNSWorker connection*
> connected the main Tor process to a separate thread or process used
> for running `gethostbyname()` or `getaddrinfo()`.  (Nowadays we use
> Libevent's evdns facility to perform DNS requests asynchronously.)

#### Linked connections ####

Sometimes two connections are joined together, such that data which the
Tor process sends on one should immediately be received by the same
Tor process on the other.  (For example, when Tor makes a tunneled
directory connection, this is implemented on the client side as a
directory connection whose output goes, not to the network, but to a
local entry connection. And when a directory receives a tunneled
directory connection, this is implemented as an exit connection whose
output goes, not to the network, but to a local directory connection.)

The earliest versions of Tor to support linked connections used
socketpairs for the purpose.  But using socketpairs forced us to copy
data through kernelspace, and wasted limited file descriptors.  So
instead, a pair of connections can be linked in-process.  Each linked
connection has a pointer to the other, such that data written on one
is immediately readable on the other, and vice versa.

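Here is a minimal sketch of in-process linking, under the simplifying assumption that a write on one side copies straight into the partner's inbuf.  The struct, its fixed-size inbuf, and the `sketch_*` functions are hypothetical; the real connection objects carry much more state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_INBUF_CAP 64

/* Hypothetical linked connection: a partner pointer plus an inbuf. */
typedef struct sketch_linked_conn_t {
  struct sketch_linked_conn_t *linked_conn;
  char inbuf[SKETCH_INBUF_CAP];
  size_t inbuf_len;
} sketch_linked_conn_t;

/* Join two connections so that each points at the other. */
static void sketch_link(sketch_linked_conn_t *a, sketch_linked_conn_t *b)
{
  assert(a && b && a != b);
  a->linked_conn = b;
  b->linked_conn = a;
}

/* "Write" on conn: the bytes land in the partner's inbuf with no socket
 * and no trip through the kernel.  Returns the number of bytes accepted,
 * which is less than n when the partner's inbuf fills up. */
static size_t sketch_linked_write(sketch_linked_conn_t *conn,
                                  const char *data, size_t n)
{
  sketch_linked_conn_t *dest;
  size_t room, take;
  assert(conn && conn->linked_conn && data);
  dest = conn->linked_conn;
  room = SKETCH_INBUF_CAP - dest->inbuf_len;
  take = n < room ? n : room;
  memcpy(dest->inbuf + dest->inbuf_len, data, take);
  dest->inbuf_len += take;
  return take;
}
```

No file descriptor is consumed by either side, which is the advantage over the socketpair approach.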
### From connections to channels ###

There's an abstraction layer called **channels** that sits above OR
connections (the ones that handle cells) and below cells.  A channel's
purpose is to transmit authenticated cells from one Tor instance
(relay or client) to another.

Currently, only one implementation exists: `channel_tls_t`, which sends
and receives cells over a TLS-based OR connection.

Cells are sent on a channel using
`channel_write_{,packed_,var_}cell()`. Incoming cells arrive on a
channel from its backend using `channel_queue*_cell()`, and are
immediately processed using `channel_process_cells()`.

Some cell types are handled below the channel layer, such as those
that affect handshaking only.  And some others are passed up to the
generic cross-channel code in `command.c`: cells like `DESTROY` and
`CREATED` are all trivial to handle.  But relay cells
require special handling...

### From channels through circuits ###

When a relay cell arrives on an existing circuit, it is handled in
`circuit_receive_relay_cell()` -- one of the innermost functions in
Tor.  This function encrypts or decrypts the relay cell as
appropriate, and decides whether the cell is intended for the current
hop of the circuit.

If the cell *is* intended for the current hop, we pass it to
`connection_edge_process_relay_cell()` in `relay.c`, which acts on it
based on its relay command, and (possibly) queues its data on an
`edge_connection_t`.

If the cell *is not* intended for the current hop, we queue it for the
next channel in sequence with `append_cell_to_circuit_queue()`.  This
places the cell on a per-circuit queue for cells headed out on that
particular channel.

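The "remove one layer, then deliver or forward" decision can be shown with a toy model.  Everything here is an assumption made for illustration: the XOR "cipher", the rule that a zero first byte marks a cell as meant for this hop, and all of the `sketch_*` names.  Tor's real cryptography and its real recognition check are considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAYLOAD_LEN 4

typedef struct sketch_relay_cell_t {
  unsigned char payload[SKETCH_PAYLOAD_LEN];
} sketch_relay_cell_t;

/* Hypothetical per-hop state: a toy key, plus counters that stand in for
 * "handed to the edge code" and "queued for the next channel". */
typedef struct sketch_hop_t {
  unsigned char key;
  int n_delivered;
  int n_forwarded;
} sketch_hop_t;

/* Toy layer of encryption: XOR is its own inverse, so the same function
 * both adds and removes a layer. */
static void sketch_relay_crypt(sketch_relay_cell_t *cell, unsigned char key)
{
  size_t i;
  for (i = 0; i < SKETCH_PAYLOAD_LEN; ++i)
    cell->payload[i] ^= key;
}

/* Remove this hop's layer, then decide what to do with the cell.
 * Returns 1 if the cell was for this hop, 0 if it was passed along. */
static int sketch_receive_relay_cell(sketch_hop_t *hop,
                                     sketch_relay_cell_t *cell)
{
  assert(hop && cell);
  sketch_relay_crypt(cell, hop->key);
  if (cell->payload[0] == 0) {   /* toy "intended for this hop" marker */
    ++hop->n_delivered;
    return 1;
  }
  ++hop->n_forwarded;
  return 0;
}
```

A cell wrapped for the second hop looks like noise to the first hop, so the first hop forwards it; only after the second layer comes off does the marker become visible.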
### Sending cells on circuits: the complicated bit.

Relay cells are queued onto circuits from one of two (main) sources:
reading data from edge connections, and receiving a cell to be relayed
on a circuit.  Both of these sources place their cells on a cell queue:
each circuit has one cell queue for each direction that it travels.

A naive implementation would skip using cell queues, and instead write
each outgoing relay cell to its channel immediately.  (Tor did this in
its earlier versions.)  But such an approach tends to give poor
performance, because it allows high-volume circuits to clog channels,
and it forces the Tor server to send data queued on a circuit even
after that circuit has been closed.

So by using queues on each circuit, we can add cells to each channel
on a just-in-time basis, choosing the cell at each moment based on
a performance-aware algorithm.

This logic is implemented in two main modules: `scheduler.c` and
`circuitmux*.c`.  The scheduler code is responsible for determining
globally, across all channels that could write cells, which one should
next receive queued cells.  The circuitmux code determines, for all
of the circuits with queued cells for a channel, which one should
queue the next cell.

(This logic applies to outgoing relay cells only; incoming relay cells
are processed as they arrive.)

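To see why per-circuit queues help, consider the simplest possible selection policy: plain round-robin over the circuits that have cells waiting.  This is a sketch only and is not the policy the circuitmux code uses; the `sketch_mux_t` type, its fixed-size array, and `sketch_mux_pick()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CIRCS 8

/* Hypothetical circuit: an identifier and a count of queued cells. */
typedef struct sketch_circ_t {
  int circ_id;
  int n_queued;
} sketch_circ_t;

/* Hypothetical multiplexer for one channel. */
typedef struct sketch_mux_t {
  sketch_circ_t circs[SKETCH_MAX_CIRCS];
  int n_circs;
  int next;   /* index at which the next search starts */
} sketch_mux_t;

/* Choose the circuit whose cell goes onto the channel next, and take one
 * cell off its queue.  Returns the circuit's id, or -1 when no circuit
 * has anything queued. */
static int sketch_mux_pick(sketch_mux_t *mux)
{
  int i;
  assert(mux && mux->n_circs <= SKETCH_MAX_CIRCS);
  for (i = 0; i < mux->n_circs; ++i) {
    int idx = (mux->next + i) % mux->n_circs;
    if (mux->circs[idx].n_queued > 0) {
      --mux->circs[idx].n_queued;
      mux->next = (idx + 1) % mux->n_circs;
      return mux->circs[idx].circ_id;
    }
  }
  return -1;
}
```

Even with this crude policy, a circuit with many queued cells cannot starve a quiet one: the quiet circuit's single cell goes out second, not last.  And because cells leave their queue only when picked, the cells of a closed circuit can simply be discarded before they ever reach the channel.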