\documentclass{article}
\begin{document}
\author{George Danezis\\{\tt gdane@microsoft.com}}
\title{An anomaly-based censorship-detection\\system for Tor}
\date{September 9, 2011}
\maketitle

\section{Introduction}
The Tor project is currently the most widely used anonymity and censorship
resistance system worldwide.
As a result, national governments occasionally or regularly block access
to its facilities for relaying traffic.
Major blocking might be easy to detect, but blocking by smaller
jurisdictions, with fewer users, could go undetected for some time.
Yet, early detection may be key to deploying countermeasures.
We have designed an ``early warning'' system that looks for anomalies in
the volumes of connections from users in different jurisdictions and flags
potential censorship events.
Special care has been taken to ensure the detector is robust to
manipulations and noise that could be used to block without raising an
alert.

The detector works on the aggregate number of users connecting to a
fraction of directory servers per day.
These statistics are gathered and provided by the Tor project in a
sanitised form to minimise the potential for harm to active users.
The data collection has been historically patchy, introducing wild
variations over time that are not due to censorship.
The detector is based on a simple model of the number of users per day
per jurisdiction.
That model is used to assess whether the number of users we observe is
typical, too high, or too low.
In a nutshell, the prediction for any day is based on the activity of
previous days, both locally and worldwide.
\section{The model intuition}

The detector is based on a model of the number of connections from every
jurisdiction, built from the number of connections in the past as well as
a model of the ``natural'' variation or evolution of that number.
More concretely, consider that at time $t_i$ we have observed $C_{ij}$
connections from country $j$.
Since we are concerned with abnormal increases or falls in the volume of
connections, we compare this with the number of connections observed at a
past time $t_{i-1}$, denoted $C_{(i-1)j}$, from the same country $j$.
The ratio $R_{ij} = C_{ij} / C_{(i-1)j}$ summarises the change in the
number of users.
Inferring whether the ratio $R_{ij}$ is within an expected or unexpected
range allows us to detect potential censorship events.
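As an illustration, the ratio computation could be sketched in Python
roughly as follows; the array layout is a hypothetical input format, not
part of the deployed system:

\begin{verbatim}
import numpy as np

def ratios(counts_now, counts_prev):
    """Return R_ij = C_ij / C_(i-1)j for each jurisdiction j.

    counts_now, counts_prev: arrays of daily user counts, aligned by
    country.  Ratios with a zero previous count are undefined and
    returned as NaN."""
    counts_now = np.asarray(counts_now, dtype=float)
    counts_prev = np.asarray(counts_prev, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        r = counts_now / counts_prev
    r[counts_prev == 0] = np.nan
    return r
\end{verbatim}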
We consider that a ratio $R_{ij}$ within a jurisdiction $j$ is ``normal''
if it follows the trends we are observing in other jurisdictions.
Therefore for every time $t_i$ we use the ratios $R_{ij}$ of many
countries to understand the global trends of usage of Tor, and we compare
specific countries' ratios to this model.
If they are broadly within the global trends we assume no censorship is
taking place, otherwise we raise an alarm.
\section{The model details}

We model each data point $C_{ij}$ of the number of users connected at time
$t_i$ from country $j$ as a single sample of a Poisson process with a rate
$\lambda_{ij}$ modelling the underlying number of users.
The Poisson process allows us to take into account that in jurisdictions
with very few users we will naturally have some days of relatively low or
high usage---just because a handful of users may or may not use Tor in a
day.
Even observing zero users from such jurisdictions on some days may not be
a significant event.
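For instance, under the Poisson model a jurisdiction with an underlying
rate of only three users per day is expected to show zero users on
roughly one day in twenty.
A minimal illustrative calculation (not part of the detector itself):

\begin{verbatim}
from scipy.stats import poisson

# Probability of observing zero connections on a given day when the
# underlying daily rate is only a handful of users.
for rate in (1, 3, 10):
    print(rate, poisson.pmf(0, rate))
# A rate of 3 gives roughly 0.05: about one "silent" day in twenty,
# so a zero count alone is not evidence of censorship.
\end{verbatim}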
We are trying to detect whether the change in the rate $\lambda_{ij}$
between a previous time $t_{i-1}$ and time $t_i$ for jurisdiction $j$ is
normal or abnormal compared with other jurisdictions.
This change is $\lambda_{ij} / \lambda_{(i-1)j}$, which for jurisdictions
with a high number of users is very close to
$C_{ij} / C_{(i-1)j} = R_{ij}$.
We model the $R_{ij}$'s from all jurisdictions as following a Normal
distribution $N(m,v)$ with a certain mean ($m$) and variance ($v$) to be
inferred.
This is of course a modelling assumption.
We use a Normal distribution because, for a given mean and variance, it is
the distribution with the most uncertainty (maximum entropy): as a result
the model has a higher variance than the real world, ensuring that it
raises fewer false alarms of censorship.
The parameters of $N(m,v)$ are inferred directly as point estimates from
the readings in a set of jurisdictions.
Then the ratio $R_{ij}$ of a given country is compared with that
distribution: an alarm is raised if the ratio falls outside a chosen
percentile range of the distribution.
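A minimal sketch of this step, assuming the ratios from the modelling
jurisdictions are already collected in an array (illustrative only, not
the deployed implementation):

\begin{verbatim}
import numpy as np
from scipy.stats import norm

def fit_ratio_model(ratios):
    """Point estimates of the mean m and variance v of N(m, v)."""
    ratios = np.asarray(ratios, dtype=float)
    ratios = ratios[np.isfinite(ratios)]
    return ratios.mean(), ratios.var(ddof=1)

def is_typical(ratio, m, v, quantile=0.9999):
    """True if the ratio lies inside the central quantile mass of N(m, v)."""
    lo, hi = norm.interval(quantile, loc=m, scale=np.sqrt(v))
    return lo <= ratio <= hi
\end{verbatim}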
\section{The model robustness}

At every stage of detection we take special steps to ensure the detector
is robust to manipulation by jurisdictions interested in censoring quickly
without being detected.
First, the parameter estimation for $N(m,v)$ is hardened: we only use the
largest jurisdictions to model ratios, and within those we remove any
outliers that fall outside four inter-quartile ranges of the median.
This ensures that a jurisdiction with a very high or very low ratio does
not influence the model of ratios (and can subsequently be detected as
abnormal).
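The outlier-removal step could be sketched as follows; the cut-off of
four inter-quartile ranges is the one stated above, while the function
itself is only illustrative:

\begin{verbatim}
import numpy as np

def trim_outliers(ratios, k=4.0):
    """Drop ratios further than k inter-quartile ranges from the median,
    so a single extreme jurisdiction cannot drag the Normal model."""
    ratios = np.asarray(ratios, dtype=float)
    ratios = ratios[np.isfinite(ratios)]
    q1, med, q3 = np.percentile(ratios, [25, 50, 75])
    keep = np.abs(ratios - med) <= k * (q3 - q1)
    return ratios[keep]
\end{verbatim}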
Since we chose jurisdictions with many users to build the model of ratios,
we can approximate the rates $\lambda_{ij}$ by the actual observed numbers
of users $C_{ij}$.
On the other hand, when we try to detect whether a jurisdiction has a
typical rate we cannot make this assumption.
The rate $\lambda_{ij}$ of a Poisson variable can be inferred from a
single sample $C_{ij}$ using a Gamma prior, in which case the inferred
rate follows a Gamma distribution.
In practice (because we are using a single sample) this can in turn be
approximated by a Poisson distribution with parameter $C_{ij}$.
Using this observation we extract a range of possible rates for each
jurisdiction based on $C_{ij}$, namely
$[\lambda_{ij}^{\min}, \lambda_{ij}^{\max}]$.
Then we test whether that full range is within the typical range---if
not, we raise an alarm.
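One possible reading of this test, sketched in Python; here only the
current count is treated as uncertain and the previous observation is
used directly as the denominator, which is an illustrative
simplification, not the deployed code:

\begin{verbatim}
import numpy as np
from scipy.stats import norm, poisson

def rate_range(count, quantile=0.9999):
    """Plausible range for the underlying rate, given a single observed
    count, approximated by a Poisson distribution with parameter count."""
    if count == 0:
        return 0.0, 0.0
    return poisson.interval(quantile, count)

def alarm(c_now, c_prev, m, v, quantile=0.9999):
    """Alarm if the whole plausible range of the ratio lies outside the
    typical range under N(m, v)."""
    if c_prev == 0:
        return False                # no baseline to compare against
    lam_min, lam_max = rate_range(c_now, quantile)
    lo, hi = norm.interval(quantile, loc=m, scale=np.sqrt(v))
    return lam_max / c_prev < lo or lam_min / c_prev > hi
\end{verbatim}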
\section{The parameters}

The deployed model considers a time interval of seven (7) days to model
connection rates (i.e.\ $t_i - t_{i-1} = 7$ days).
The key reason for a weekly model is our observation that some
jurisdictions exhibit weekly patterns.
A `previous day' model would then raise alarms every time weekly patterns
emerge.
We use the 50 largest jurisdictions to build our models of typical ratios
of traffic over time---as expected, most of them are countries where no
mass censorship has been reported.
This strengthens the model as a description of ``normal'' Tor connection
patterns.

We consider that a ratio of connections is typical if it falls within the
99.99~\% percentile range of the Normal distribution $N(m,v)$ modelling
ratios.
This ensures that the expected rate of false alarms is about $1 / 10000$
per test, and therefore only a handful a week (given the large number of
jurisdictions).
Similarly, we infer the range of the usage rate of each jurisdiction
(given $C_{ij}$) to be the 99.99~\% percentile range of a Poisson
distribution with parameter $C_{ij}$.
This full range must lie within the typical range of ratios to avoid
raising an alarm.
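Putting the pieces together, an end-to-end sketch with the parameters
stated in this section might look as follows; the data layout and
function name are hypothetical, and this is not the deployed detector:

\begin{verbatim}
import numpy as np
from scipy.stats import norm, poisson

INTERVAL_DAYS = 7     # compare with the same weekday one week earlier
MODEL_COUNTRIES = 50  # largest jurisdictions used to fit N(m, v)
QUANTILE = 0.9999     # central mass treated as "typical"

def detect(day, counts):
    """Return the countries flagged on `day`.

    counts: dict country -> list of daily user counts (hypothetical
    input layout); `day` indexes into each per-country series."""
    prev = day - INTERVAL_DAYS
    largest = sorted(counts, key=lambda c: counts[c][day], reverse=True)
    ratios = np.array([counts[c][day] / counts[c][prev]
                       for c in largest[:MODEL_COUNTRIES]
                       if counts[c][prev] > 0])
    # Harden the fit: drop ratios beyond four IQRs of the median.
    q1, med, q3 = np.percentile(ratios, [25, 50, 75])
    ratios = ratios[np.abs(ratios - med) <= 4 * (q3 - q1)]
    lo, hi = norm.interval(QUANTILE, loc=ratios.mean(),
                           scale=ratios.std(ddof=1))
    flagged = []
    for country, series in counts.items():
        c_now, c_prev = series[day], series[prev]
        if c_prev == 0:
            continue
        if c_now:
            lam_min, lam_max = poisson.interval(QUANTILE, c_now)
        else:
            lam_min, lam_max = 0.0, 0.0
        if lam_max / c_prev < lo or lam_min / c_prev > hi:
            flagged.append(country)
    return flagged
\end{verbatim}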
\section{Further work}

The detector uses time series of user connections to directory servers to
detect censorship.
Any censorship method that does not influence these numbers would, as a
result, not be detected.
This includes active attacks: a censor could substitute genuine requests
with requests from adversary-controlled machines to keep the numbers
within typical ranges.

A better model, making use of multiple previous readings, may improve the
accuracy of detection.
In particular, when a censorship event occurs there is a structural
change, and a model that predicts future user loads from pre-event
readings will fail.
This is not a critical problem, as these ``false positives'' are
concentrated after real censorship events, but the effect may be
confusing to a reader.
On the other hand, a jurisdiction can still censor while evading
detection by limiting the rate of censorship so that it stays within the
typical range for the time period concerned.
Therefore, adapting the detector to run over longer periods would be
necessary to detect such attacks.
\end{document}