% $Id: e2e-traffic.tex 1257 2004-05-04 03:21:15Z nickm $

\documentclass{llncs}

\usepackage{epsfig}
\usepackage{url}
\usepackage{amsmath}

\renewcommand\url{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
\newcommand\emailaddr{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
\newcommand\XXXX[1]{{\small\bf [XXXX #1]}}

\newcommand\PMIX{P_{\mbox{\scriptsize MIX}}}
\newcommand\Pobserve{P_{\mbox{\scriptsize observe}}}
\newcommand\Ponline{P_{\mbox{\scriptsize online}}}
\newcommand\Pjunk{P_{\mbox{\scriptsize junk}}}
\newcommand\Pdelay{P_{\mbox{\scriptsize delay}}}

% LLNCS says that vectors should be bold-italic.  Past publications on this
% topic have all used arrows.  We use arrows.  Should we switch?
\newcommand\V[1]{\overrightarrow{#1}}
\newcommand\B[1]{\overline{#1}}
%\newcommand\V[1]{\vec{#1}}

\newenvironment{tightlist}{\begin{list}{$\bullet$}{
  \setlength{\itemsep}{0mm}
    \setlength{\parsep}{0mm}
    %  \setlength{\labelsep}{0mm}
    %  \setlength{\labelwidth}{0mm}
    %  \setlength{\topsep}{0mm}
    }}{\end{list}}

% Cut down on whitespace above and below figures displayed at head/foot of
% page.
\setlength{\textfloatsep}{3mm}
% Cut down on whitespace above and below figures displayed in middle of page
\setlength{\intextsep}{3mm}

\begin{document}
\title{Practical Traffic Analysis: \\
       Extending and Resisting Statistical Disclosure}

\author{Nick Mathewson and Roger Dingledine}
\institute{The Free Haven Project\\
\email{\{nickm,arma\}@freehaven.net}}

\maketitle
\pagestyle{empty}
%\centerline{\LARGE\bf *DRAFT*---not for publication}
%======================================================================
\begin{abstract}
We extend earlier research on mounting and resisting passive
long-term end-to-end traffic analysis attacks against anonymous message
systems,
% We relax the assumptions of earlier attacks
by describing how an
eavesdropper can learn sender-receiver connections even when the substrate
is a network of pool mixes, the attacker is non-global, and senders have
complex behavior or generate padding messages.
Additionally, we describe how an attacker can use information about
message distinguishability to speed the attack.
We simulate our attacks for a variety of
scenarios, focusing on the amount of information needed to link senders
to their recipients. In each scenario, we show that the intersection
attack is slowed but still succeeds against a steady-state mix network. We
find that the attack takes an impractical amount of time when message
delivery times are highly variable; when the attacker can observe very
little of the network; and when users pad consistently and the adversary
does not know how the network behaves in their absence.

\end{abstract}

%======================================================================
\section{Introduction}
\label{sec:intro}
Mix networks aim to allow senders to anonymously deliver messages to
recipients. One of the strongest attacks against current deployable
designs is the \emph{long-term intersection attack}. In
this attack, a passive eavesdropper observes a large volume of network
traffic and notices that certain recipients are more likely to receive
messages after particular senders have transmitted messages.
% Although these correlations are slight,
That is, if a sender (call her Alice) maintains a fairly consistent
pattern of recipients over time, the attacker can deduce Alice's recipients.

Researchers have theorized that these attacks should be extremely
effective in many real-world contexts, but so far it has been difficult
to reason about when these attacks would be successful and how long they
would take.

%% Put this stuff in the previous work section if you want.
%Kesdogan, Agrawal, and Penz propose an example of a long-term
%intersection attack in \cite{limits-open}, and expand on this attack in
%\cite{agrawal03}. Their {\it disclosure attack}
%assumes a fairly strict model of sender behavior and works against
%only a single batch mix.  (A batch mix waits until it receives $b$
%messages, then reorders and retransmits them all.)  Additionally, the
%disclosure attack requires an attacker to solve an NP-complete
%problem to reveal the connections between senders and recipients.
%
%Danezis presents an algorithmically simpler {\it statistical
%disclosure attack} \cite{statistical-disclosure} that requires far
%less computation.  But while this attack
%is easier to describe and implement, it assumes the same
%restrictive sender and network models as the original disclosure
%attack.

Here we extend a version of the long-term intersection attack called the
statistical disclosure attack \cite{statistical-disclosure} to work in
real-world circumstances. Specifically, whereas the original model for
this attack makes strong assumptions about sender behavior and only
works against a single batch mix, we show how an attacker can learn
Alice's regular recipients even when:

\begin{tightlist}
\item Alice chooses non-uniformly among her communication partners,
  and can send multiple messages in a single batch.
\item The attacker lacks {\it a priori} knowledge of the network's
  average behavior when Alice is not sending messages.
\item Mixes use a different batching algorithm, such as Mixmaster's
  dynamic-pool algorithm \cite{mixmaster-spec,trickle02} or its
  generalization \cite{pet2003-diaz}.
\item Alice uses a mix network (of any topology, with synchronous or
  asynchronous batching) to relay her messages through a succession of
  mixes, instead of using just a single mix.
\item Alice disguises when she is sending real messages by sending
  padding traffic to be dropped by mix nodes in the network.
\item The attacker can only view a subset of the messages entering and
  leaving the network (so long as this subset includes some messages
  from Alice and some messages to Alice's recipients).
\item The cover traffic generated by other senders changes
  slowly over time.  (We do not address this case completely.)
\end{tightlist}
Each deviation from the original
model reduces the rate at which the attacker learns Alice's recipients, and
increases the amount of traffic he must observe.

Additionally, we show how an attacker can exploit additional knowledge, such
as distinguishability between messages, to speed these attacks.  For example,
an attacker who sees message contents can take into account whether messages
are written in the same language or signed by the same pseudonym, and thereby
partition messages into different classes and analyze the classes
independently.
%\item {\it A priori} suspicion of certain messages having originated
%  or not originated from Alice.  For example, messages written in a
%  language Alice doesn't speak are unlikely to have been written
%  by Alice.

The attacks in this paper fail to work when:
\begin{tightlist}
\item Alice's behavior is not consistent over time.  If Alice does not
  produce enough traffic with the same recipients,
  the attacker cannot learn her behavior.
  %%We will quantify what `enough' means below.
    % if it's true that we will, we should probably shout it louder
    % in the abstract. guess we'll wait to see if we do. -RD
\item The attacker cannot observe how the network behaves in Alice's
  absence. If Alice always sends the same number of messages, in
  every round, forever, a passive attacker cannot learn who receives
  messages in Alice's absence.
  %% Our preliminary results suggest that this effect can be achieved with
  %% significantly less padding.
  % (Yes, but not unequivocally.  `more research is needed'. -NM)
\item The attacker cannot tell when Alice is originating messages.
  %% For example, the sender may be running her own mix
  %% node and injecting messages into it directly.
  % why leave this out? it sounds important. -RD
\end{tightlist}

We begin in Section~\ref{sec:previous-work} by presenting a brief
background overview on mix networks, traffic analysis, the disclosure
attack, and the statistical disclosure attack.  In
Section~\ref{sec:extending} we present our enhancements to the statistical
disclosure attack.  We present simulated experimental results
in Section~\ref{sec:simulation}, and close in Section~\ref{sec:conclusion}
with recommendations for resisting this class of attacks, implications
for mix network design, and a set of open questions for future work.

%======================================================================
\section{Previous work}
\label{sec:previous-work}
\label{subsec:mix-nets}
Chaum \cite{chaum-mix} proposed hiding the correspondence between sender and
recipient by wrapping messages in layers of public-key cryptography, and
relaying them through a path composed of \emph{mixes}. Each mix in turn
decrypts, delays, and re-orders messages, before relaying them toward their
destinations. % Chaum proved the security of a mix against a \emph{passive
%  adversary} who eavesdrops on all communications but is unable to observe
%the reordering inside the mix.
% This above statement is misleading, no? The proof only holds when
% Alice sends always/only once, IIRC. -NM
Because some mixes might be controlled by an
adversary, Alice can direct her messages through a sequence or `chain' of
mixes in a network, so that no single mix can link her to her recipient.

Many subsequent designs have been proposed, including Babel \cite{babel},
Mixmaster \cite{mixmaster-spec}, and Mixminion \cite{minion-design}.
We will not address the differences between these systems in any detail: from
the point of view of a long-term intersection attack, the internals of the
network are irrelevant so long as the attacker can observe messages entering
and leaving the network, and can guess when a message entering the network is
likely to leave.

Another class of anonymity designs aims to provide low-latency
connections for web browsing and other interactive services
\cite{web-mix:pet2000,freedom2-arch,tor-design,or-jsac98}.
We do not address these systems here because short-term timing and packet
counting attacks seem sufficient against them \cite{SS03}.

Attacks against mix networks aim to reduce the anonymity of users by
linking anonymous senders with the messages they send, by linking
anonymous recipients with the messages they receive, or by linking
anonymous messages with one another. For detailed lists of attacks,
consult \cite{back01,raymond00}.  Attackers can
trace messages through the network by observing network
traffic, compromising mixes, compromising keys, delaying messages
so they stand out from other traffic, or altering messages
in transit.  They can learn a given message's destination
by flooding the network with messages, replaying multiple copies
of a message, or shaping traffic to isolate a target message from
other unknown traffic \cite{trickle02}. Attackers can
discourage users from using honest mixes by making them unreliable
\cite{back01,casc-rep}. They can analyze intercepted message text to
look for commonalities between otherwise unlinked senders
\cite{rao-pseudonymity}.

\subsection{The long-term intersection attack}
Even if we foil all the above attacks, an adversary can
mount a \emph{long-term intersection attack} by correlating times
when senders and receivers are active \cite{disad-free-routes}.

A variety of countermeasures make intersection attacks harder.
Kesdogan's stop-and-go mixes
\cite{stop-and-go} provide probabilistic anonymity by letting users
specify message latencies, thereby broadening the range of time
when messages might leave the mix network. Similarly, batching strategies
\cite{trickle02} as in Mixmaster and Mixminion use message
pools to spread out the possible exit times for messages.

Rather than expanding the set of messages that might have
been sent by a suspect sender, other designs expand the set of
senders that might have sent a target message. A sender who
also runs a node in the mix network can conceal whether a
given message originated at her node or was relayed from another node
\cite{bennett:pet2003,tarzan:ccs02,crowds:tissec}. But even with this
approach, the adversary can observe whether certain traffic patterns are
present when a user is online (possibly sending) and absent when a user is
offline (certainly not sending) \cite{wright02,wright03}.

A sender can also conceal whether she is currently active by consistently
sending decoy (dummy) traffic. Pipenet \cite{pipenet} conceals
traffic patterns by constant padding on every link. Unfortunately, a
single user can shut down this network simply by not sending.
%%Backing
%%off even a little bit from this constant-padding scheme has been thought to
%%allow the
%%above intersection attack, and in some systems may even introduce
%%conspicuous gaps in traffic that can be followed through the network
%%\cite{defensive-dropping}.
%
% I might be wrong, but doesn't the cite suggest that 'backing off even a
% little' is actually kinda safe?
%
Berthold and Langos aim to increase the
difficulty of intersection attacks with a scheme for preparing plausible
dummy traffic and having other nodes send it on Alice's behalf when she is
offline
\cite{langos02}, but their design has many practical problems.

Finally, note that while the adversary can perform this long-term
intersection attack entirely passively, active attacks (such as
blending attacks \cite{trickle02} against a suspected sender) can help
him reduce the set of suspects at each round.
%For example, performing
%blending attacks \cite{trickle02} against a suspected sender can greatly
%speed the attack.
%Danezis and Sassaman propose a ``heartbeat'' dummy
%scheme \cite{danezis:wpes2003} where dummies are sent from a node in
%the network back to itself, creating an early warning system to detect
%if the adversary is launching such a blending attack.

\subsection{The disclosure attack}
\label{subsec:disclosure-attack}
In 2002, Kesdogan, Agrawal, and Penz presented the disclosure
attack \cite{limits-open}, an intersection attack against a single
sender on a single batch mix.

The disclosure attack assumes a global passive eavesdropper interested in
learning the recipients of a single sender Alice. It assumes that Alice
sends messages to $m$ recipients; that Alice sends a single message
(recipient chosen at random from $m$) in each batch of $b$ messages;
and that the recipients of the other $b-1$ messages are chosen at random
from the set of $N$ possible recipients.

The attacker observes the messages leaving the mix and
constructs sets $R_i$ of recipients receiving messages in batch $i$.
The attacker then performs an NP-complete computation to identify $m$
mutually disjoint recipient sets $R_i$, so that each of Alice's
recipients is necessarily contained in exactly one of the sets.
Intersecting these sets with subsequent recipient sets reveals Alice's
recipients.

% \XXXX{Give the result formulas in the disclosure paper.}

% \XXXX{The above may not be 100\% accurate; must re-read the paper.}

\subsection{The statistical disclosure attack}
\label{subsec:statistical-disclosure}
In 2003, Danezis presented the statistical disclosure
attack \cite{statistical-disclosure}, which makes the same operational
assumptions as the original disclosure attack but is far easier to implement
in terms of storage, speed, and algorithmic complexity.  Unlike its
predecessor, statistical disclosure only reveals {\it likely} recipients; it
does not identify Alice's recipients with certainty.

In the statistical disclosure attack, we model Alice's behavior as an
unknown vector $\V{v}$ whose elements correspond to the
probability of Alice sending a message to each of the $N$ recipients
in the system.  The elements of $\V{v}$ corresponding to Alice's $m$
recipients will be $1/m$; the other $N-m$ elements of $\V{v}$ will
be $0$.  We model the behavior of the covering ``background'' traffic
sent by other users
as a known vector $\V{u}$ each of whose $N$ elements is $1/N$.

The attacker derives from each output round $i$ an observation vector
$\V{o_i}$, each of whose elements corresponds to the probability of
Alice's having sent a message to each particular recipient in that round.
That is, in
a round $i$ where Alice has sent a message, each element of $\V{o_i}$ is
$1/b$ if it corresponds to a recipient who has received a message,
and $0$ if it does not.
Taking the arithmetic mean $\B{O}$ of a large set of these observation
vectors gives (by the law of large numbers):
\[\B{O} = \frac{1}{t}\sum_{i=i}^{t} \V{o_i} \approx
  \frac{\V{v} + (b-1)\V{u}}{b} \]
From this, the attacker estimates Alice's behavior:
\[\V{v} \approx b\frac{\sum_{i=1}^t \V{o_i}}{t} - (b-i)\V{u}\]

Danezis also derives a precondition that the attack will only succeed when
\( m < \frac{N}{b-1} \), and calculates the expected number of rounds to
succeed (with $95\%$ confidence for security parameter $l=2$ and $99\%$
confidence for $l=3$) \cite{gd-thesis}:
% This equation (from the statistical disclosure paper) is (George says) wrong.
%
%\[t > \left[ m \cdot l \left(\sqrt{\frac{m-1}{m}} +
%                       \sqrt{\frac{N-1}{N^2}(b-1)} \right) \right]^2 \]
%
% This equation (from George's thesis) is (George says) correct:
\[t > \left[ m \cdot l \left(\sqrt{\frac{N-1}{N}(b-1)} +
             \sqrt{\frac{N-1}{N^2}(b-1)+\frac{m-1}{m}} \right) \right] ^2 \]

%======================================================================
\section{Extending the statistical disclosure attack}
\label{sec:extending}
\subsection{Broadening the attack}
\label{subsec:broadening}
Here we examine ways to extend Danezis's statistical
disclosure attack to systems more closely resembling real-world mix networks.
We will simulate the time and information requirements for several of these
attacks in Section~\ref{sec:simulation} below.

\subsubsection{Complex senders, unknown background traffic:}
% \label{subsubsec:complex-senders}
First, we relax the requirements related to sender behavior.
We allow Alice to choose among her recipients with non-uniform
probability, and to send multiple messages in a single batch.  We also
remove the assumption that the attacker has full knowledge of the
distribution $\V{u}$ of cover traffic sent by users other than Alice.

To model Alice's varying number of messages, we use a probability function
$P_m$ such
that in every round Alice sends $n$ messages with probability
$P_m(n)$.  We still use a behavior vector $\V{v}$ to represent the
probability of Alice sending to each recipient,
but we no longer require Alice's recipients to have a
uniform $1/m$ probability.   Alice's expected contribution to each round
is thus $\V{v} \sum_{n=0}^{\infty} n P_m(n)$.

To mount the attack, the attacker first obtains an
estimate of the background distribution $\V{u}$ by observing a large number
$t'$ of batches to
which Alice has {\it not} contributed any messages.\footnote{The attack can
  still proceed if few such Alice-free batches exist, so long as Alice
  contributes more to some batches than to others.  Specifically, the
  approach described below (against pool mixes and mix networks)
  can exploit differences
  between low-Alice and high-Alice batches to infer background behavior.}
For each such
batch $i$, the attacker constructs a vector $\V{u_i}$, whose elements are
$1/b$ for recipients that have received a message in that batch, and
$0$ for recipients that have not.  The attacker then estimates
the background distribution $\V{u}$ as:
\[\V{u} \approx \B{U} = \frac{1}{t'}\sum_{i=1}^{t'}\V{u_i} \]

The attacker then observes, for each round $i$ in which Alice {\it does}
send a message, the number of messages $m_i$ sent by Alice, and
computes observations $\V{o_i}$ as before.  Taking the arithmetic mean of these
$\V{o_i}$ gives us
\[\B{O} = \frac{1}{t}\sum_{i=1}^{t}{\V{o_i}}
            \approx
              \frac{\B{m} \cdot \V{v} + (b-\B{m})\B{U}}{b}
              \mbox{\ where\ } \B{m} = \frac{1}{t}\sum{m_i} \]

From this, the attacker estimates Alice's behavior as
\[\V{v} \approx \frac{1}{\B{m}}
            \left[ b \cdot \B{O} - (b-\B{m})\B{U} \right] \]

\subsubsection{Attacking pool mixes and mix networks:}
% \label{subsubsec:complex-mix} can't usefully label subsubsections.
Most designs have already abandoned fixed-batch mixes in
favor of other algorithms that better hide the relation
between senders and recipients.  Such algorithms include
timed dynamic-pool mixes, generalized mixes, and randomized versions of
each \cite{pet2003-diaz,trickle02}.  Rather than reordering and
relaying all messages whenever a fixed number $b$ arrive,
these algorithms store received messages in a {\it pool}, and at fixed
intervals relay a {\it fraction} of the pooled messages based on the pool's
current size.

When attacking such a mix, the attacker no longer knows for certain
which batches contain messages from Alice.  Instead,
the attacker can only estimate, for each batch of output messages,
the probability that the batch includes one or more of Alice's messages.

Following D\'{\i}az and Serjantov's approach in \cite{pet2003-diaz}, we treat
these mixing algorithms as follows: a mix relays a
number of messages at the end of each round, depending on how many
messages it is currently storing.  All messages in the mix's pool at the end
of a round have an equal probability of being included in that round's batch.
Thus, we can characterize the mix's pooling algorithm as a probability
function $\PMIX(b|s)$---the probability that the mix relays $b$ messages
when it has $s$ messages in the pool.
%Clearly, $\forall s,
%\sum_{b=0}^{s}\PMIX(b|s) = 1$: the mix will always relay between $0$
%and $s$ messages.

%In the case of a timed dynamic-pool mix, this distribution is:
%\[ P_{TDP}(b|s) =
%  \left\{ \begin{array}{l}
%    1 \mbox{\ if\ } s>\mbox{min-pool},
%      b = min(s-\mbox{min-pool}, \left\lfloor s*\mbox{rate} \right\rfloor) \\
%    0 \mbox{\ otherwise} \end{array} \right. \]
%For a binomial timed dynamic-pool mix,
%\[ P_{BTDP}(b|s) =
%   \left\{ \begin{array}{l}
%           \frac{b_0}{s}^b(1-\frac{b_0}{s})^{s-b} \mbox{\ if\ }
%              P_{TDP}(b_0|s) = 1 \\
%          0 \mbox{\ otherwise} \end{array} \right. \]

We denote by $P_R^i(r)$ the probability that a message arriving in round
$i$ leaves the mix $r$ rounds later.
%%In future discussion, we assume for
%%the sake of simplicity that the mix has reached a steady state, so that
%%$P_R^i(r) = P_R^j(r)$, for all $i$ and $j$.
We assume that the attacker has a fair estimate of $P_R$.\footnote{The
attacker can estimate $P_R$ by
sending test messages through the mix, or by counting the
messages entering and leaving the mix and deducing the pool size.}
%Dummy messages can hamper these techniques to an extent discussed
%in \cite{dummy-msgs}.
Now, when Alice sends a message in round
$i$, the attacker observes round $i$ through some later round $i+k$,
choosing $k$ so that $\sum_{j=k+1}^{\infty} P_R^i(j)$ is negligible.
The attacker then uses $P_R$ to
compute $\B{O_w}$, the mean of the observations from these rounds,
weighted by the expected number of messages from Alice exiting in each
round:
\[ \B{O_w} = \sum_i \sum_{r=0}^k P_R^i(r) \cdot m_i \cdot \V{o_{i+r}}
   \approx \frac{\B{m} \cdot \V{v} + (\B{n}-\B{m})\V{u}}{\B{n}} \]

To solve for Alice's behavior $\V{v}$, the attacker now needs an estimate
for the background $\V{u}$.  The attacker gets this by
averaging observations $\V{u_i}$ from batches with a negligible probability of
including messages from Alice.  Such batches, however, are not
essential: If the attacker chooses a set of $\V{u_i}$ such that each
round contains (on average) a small number $\delta_a>0$ of messages from
Alice, averaging them gives:
\[\B{U'} \approx \frac{\delta_a}{\B{n}} \V{v} +
                   \frac{1-\delta_a}{\B{n}} \V{u} \]
and the attacker can solve again for $\V{v}$ in the earlier equation for
$\B{O_w}$, now using
\[\V{u} \approx \frac{1}{1-\delta_a}
               \left[ \B{n} \cdot \B{U'} - \delta_a \cdot \V{v} \right] \]

Senders can also direct their messages through multi-hop paths
in a network of mixes.  While using a mix network increases the effort
needed to observe all messages leaving the system, it
has no additional effect on intersection attacks beyond changing the
system's delaying characteristics.
Assume (for simplicity) that all mixes have the same
delay distribution $P_R$, and that Alice chooses paths of length $\ell_0$.
The chance of
a message being delayed by a further $d$ rounds is now
\[  P_R'(\ell_0+d) = \binom{\ell_0+d-1}{d} (1-P_D)^{\ell_0} P_D^d \]
%If Alice chooses her path length probabilistically according to
%$P_L(\ell)$, we have
%\[ P_R'(r) =
%   \sum_{\ell=1}^r P_L(\ell) \binom{r-1}{r-\ell} (1-P_D)^\ell P_d^{r-\ell} \]

% \XXXX{ How can the attacker estimate $P_L$ for Alice?  What if he can't?}

Danezis has independently extended statistical disclosure to pool mixes
\cite{gd-thesis}; Danezis and Serjantov have analyzed it in detail
\cite{statistical-disclosure04}.

\subsubsection{Dummy traffic:}
%\label{subsubsec:dummy-traffic}
Alice can also reduce the impact of traffic analysis by
periodically sending messages that are dropped
  inside\footnote{Alice might
  also send dummy traffic to ordinary recipients.  This approach
  has its problems: how should Alice generate cover texts, or get the list of
  all possible recipients?  In any case, it is
  unclear whether Alice can obscure her true recipients without sending equal
  volumes of mail to all of her non-recipients as well, which is impractical.}
the network.

Although this padding can slow or stop the attacker (as
discussed below in Section \ref{sec:simulation}), the change in the attack
is trivial: Alice's behavior
vector $\V{v}$ no longer adds to $1$, since there is now a chance that a
message from Alice will not reach any recipient.  Aside from this, the
attack can proceed as before, so long as Alice sends more messages
(including dummies) in some rounds than in others.

%%\XXXX{Is the rest of this section even vaguely germane or realistic?}
%%
%%On the other hand, suppose that some of Alice's dummy traffic reaches
%% recipients other than those with whom she ordinarily communicates\footnote{This
%%  discussion ignores the question of how Alice is to generate the text
%%  of these cover messages, if messages typically leave the mix-net
%%  unencrypted.}
%% %% If some messages in the mix-net are
%% %%  unencrypted, it will be difficult or impossible for Alice to compose
%% %%  plausible unencrypted cover texts for most possible recipients.  On
%% %%  the other hand, if all messages in the mix-net are encrypted, Alice
%% %%  can easily give her dummy messages junk padding, encrypted so that
%% %%  only the recipient of each dummy message can distinguish it from a
%% %%  real message.}.
%% Suppose that every message Alice sends has a
%% probability $P_c$ of being a cover message, and that recipients for
%% cover messages are chosen from a distribution vector $\V{v_c}$.
%% Now, instead of converging on Alice's true pattern of recipients
%% $\V{v}$, the attacker instead learns
%% \[ \V{v'} = (1-P_c)\V{v} + P_c\V{v_c} \]
%% In other words, although the attacker cannot deduce which recipients
%% receive messages from Alice, he can instead learn a superset of
%% Alice's recipients.
%%
%% Clearly, if Alice chooses $\V{v_c}$ such that she sends messages with
%% equal probability to all recipients, she can completely hide the
%% actual recipients of her messages.  In practice, however, this
%% requirement in unrealistic:  doing so will require that she send
%% messages to {\it every} participant in the system---potentially as
%% many recipients as there are active email addresses.  Such a padding
%% regime results in traffic quadratic in the number of participants in
%% the system, and thus scales poorly.

\subsubsection{Partial observation:}
%\label{subsubsec:partial}
Until now, we have required that the attacker, as a global passive
adversary, observe all the messages entering and leaving the system
(at least, all the messages sent by Alice, and all the messages
reaching Alice's recipients).  This is not so difficult
as it might seem: to be a ``global'' adversary against
Alice, an attacker need only eavesdrop upon Alice, and upon the mixes
that deliver messages to recipients. (Typically, not all mixes do
so. For example, only about one third of current Mixminion servers support
delivery.)
%For Mixmaster, it's about one half.

A non-global attacker's characteristics depend on which parts of the
network he can observe.  If the attacker
eavesdrops on a fraction of the {\it mixes}, he
 receives a sample\footnote{But possibly a biased
  sample, depending on Alice's path selection algorithm.} of the
messages entering or leaving the system. If such an attacker can see some
messages from Alice and some messages to her recipients, he can guess
Alice's recipients, but will require more rounds of observation.

Alternatively, an attacker who eavesdrops on a fraction of the {\it
  users} receives {\it all} messages sent to or
from those users but no messages sent to or from other users.  So long
as one of these users is Alice, the network (to such an attacker) is as
if the messages sent by Alice to
unobserved recipients were dummy messages.  Now the attack converges
only against observed recipients: the attacker learns
which of observed recipients
get messages from Alice, and which do not.

\subsubsection{Time-variant background traffic:}
%\label{subsubsec:time-variant}
If Alice's behavior changes completely and radically over time, long-term
intersection attacks cannot proceed: the attacker cannot make enough
observations of any version or subset of Alice's behavior to converge
on a $\B{v}$ for any of them.

On the other hand, if Alice's behavior $\V{v}$ remains consistent
while the behavior of the background traffic $\V{u}$ changes slowly, the
attacker still has some hope.  Rather than estimating a single $\B{U}$
from rounds to which Alice does not contribute, the attacker
estimates a series of successive $\B{U_i}$ values based on the
average behavior of the network during comparatively shorter
durations of time.  The attacker observes $\V{o_i}$ and
computes the average of $\V{o_i} - \B{U_i}$, as before.  Now,
\[ \V{v} \propto \frac{1}{t}\sum_{i=1}^t \V{o_i} - \B{U_i}
\]
So if an attacker can get good local estimates to $\V{u}$, the
intersection attack proceeds as before.

\subsubsection{Attacking recipients:}
%\label{subsubsec:recipients}
Finally, we note that an attacker can find recipients as well as senders by
using slightly more storage and the same computational cost.

Suppose the attacker wants to know who is sending
anonymous messages to a given recipient Bob.  The analysis remains the
same: the attacker compares sender behavior in rounds from which Bob
probably receives messages with behavior in rounds from which Bob
probably doesn't receive messages.  The only complication is that the
attacker cannot tell in advance when Bob will receive a message.
Therefore, the attacker must remember a window of recent observations
at all times,
such that if Bob later receives a message, the chance is negligible that
the message was sent before the first round in the window.

\subsection{Strengthening the attack}
\label{subsec:strenghtening}
Section \ref{subsec:broadening} extended the original
statistical disclosure attack to link senders and recipients in a
broader range of circumstances.
%In Section~\ref{sec:simulation} we will
%show that these extensions force the attacker to observe an increasingly
%large number of rounds of traffic.
In this section,
%to work in new situations
%(at the expensive of needing increased traffic)
we discuss ways to reduce the required amount of traffic
by incorporating additional information.

\subsubsection{Partitioning messages:}
%\label{subsubsec:full-linkability}
The attack is simplified if some output messages are
{\it linkable}---that is, if they are
likelier to originate from the same sender than are two randomly chosen
messages.  We consider a special case of linkability, in which we can
{\it partition} messages into separate classes such that
messages in the same class are likelier to have the same sender than
messages chosen at random.

For example, in a typical
pseudonym service, each sender has one or more pseudonyms and each
delivered messages is associated with a pseudonym.
To link senders and recipients, an attacker only needs to link senders to
their pseudonyms.  He can do so by treating
pseudonyms as virtual message
destinations: instead of collecting observations $\V{o_i}$ of
recipients who receive messages in round $i$, the attacker now
collects observations $\V{o_i}$ of linkable classes
(e.g. pseudonyms) that receive in round $i$.  Since two distinct
senders don't produce messages in the same linkability class, the
elements of Alice's $\V{v}$ and the background $\V{u}$ are now
disjoint, and thus easier for the attacker to separate.

It's also possible that the partitioning may not be complete: sometimes many
senders will send messages in the same class. For example, two binary
documents written with the same version of MS Word are more likely to be
written by the same sender than two messages selected at
random.\footnote{Encrypting all messages end-to-end would address most of
  these attacks, but is difficult in practice.  Most recipients do not
  run anonymity software, and many don't support encrypted
  email. Thus, many messages still leave today's mix
  networks in plaintext. Furthermore, today's most popular encryption
  standards (such as PGP and SMIME) have enough variation for an attacker to
  tell which implementations could have generated a given message.}

%Linkages may be more abstract: a
%sophisticated attacker could check for the presence of certain
%keywords and try to link messages based on their textual content.


To exploit these scenarios, the attacker
chooses a set of $c$ partitioning classes (such as languages or
patterns of use), and assigns each observed output
message a probability of belonging to each class.  Instead of collecting
observation
vectors with elements corresponding to recipients, the attacker now
collects observation vectors whose elements correspond to number of
messages received by each
$\left<\mbox{recipient},\mbox{class}\right>$ tuple.  (If a message
might belong to multiple classes, the attacker sets the corresponding
element of each possible class to the probability of the message's
being in that class.)
The attack
proceeds as before, but messages that fall in different
classes no longer provide cover for one another.

%\XXXX{ This attack seems not to exploit the fact that if Alice sends
%  to $<$Word6, Bob$>$, she is more likely to send to $<$Word6,
%  Carol$>$ than to $<$Word2K, Carol$>$.}  -NM

\subsubsection{Exploiting {\it a priori} suspicion:}
%\label{subsubsec:suspicion}
Finally, the attacker may have reason to believe that some messages
are more likely to have been sent by the target user than others.  For
example, if we believe that Alice
studies psychology but not astrophysics, then we will
naturally suspect that a message about psychology
is more likely to come from Alice than is a message
about astrophysics.  Similarly, if users have different views of the network,
then an attacker will suspect messages exiting from mixes Alice probably
doesn't know about less than other messages.

To exploit this knowledge, an attacker can (as suggested in the
original statistical disclosure paper)
modify the estimated probabilities in $\V{o_i}$ of Alice having sent
each delivered message.
% For example, if 100 messages are sent in a
%round, and the attacker judges that 50 of them are twice as likely to
%have been sent by Alice than the other 50, the attacker assigns
%$2/150$ to each element of $\V{o_i}$ corresponding to a likely
%message, and $1/150$ to each element corresponding to an unlikely
%message.

%======================================================================
\section{Simulation results}
\label{sec:simulation}
In Section~\ref{subsec:broadening}, we repeatedly claim that each
complication of the sender or the
network forces the attacker to gather more information. But how much?

To find out, we ran a series of simulations of our attacks, first against the
model of the original statistical disclosure attack, then against more
sophisticated models.  We describe our simulations and present results below.

\subsubsection{The original statistical disclosure attack:}
%trial1
%We simulated Danezis's original statistical disclosure attack,
Our simulation
varied the parameters $N$ (the number of recipients), $m$ (the number
of Alice's recipients), and $b$ (the batch size).  The simulated ``Alice''
sends a single message every round to one of her recipients, chosen uniformly
at random.  The simulated background sends to $b-1$ additional recipients per
round, also chosen uniformly at random.  We ran 100 trial attacks for each
chosen $\left<N,m,b\right>$ tuple.  Each attack was set to halt when the
attacker had correctly identified Alice's recipients, or when 1,000,000
rounds had passed.  (We imposed this cap to keep our simulator
from getting stuck on hopeless cases.)

%\begin{figure}[ht]
%\centering
%\mbox{\epsfig{angle=0,figure=graphs/fig1,width=4in}}
%\caption{Statistical disclosure model: Median rounds to guess all recipients}
%\label{fig1}
%\end{figure}

%\begin{figure}[ht]
%\centering
%\mbox{\epsfig{angle=0,figure=graphs/fig2a,width=4in}}
%\caption{Unknown background: Median rounds to guess all recipients}
%\label{fig2a}
%\end{figure}

Figure~\ref{fig1} presents the results of our simulation
(the low-$m$ curves are at the bottom).
As expected, the attack
becomes more effective when Alice sends messages to only a few
recipients (small $m$); when there are more recipients to whom Alice does not
send (large $N$); or when batch sizes are small (small $b$).

\begin{figure}
\begin{minipage}[t]{5.75cm}
\mbox{\epsfig{angle=0,figure=graphs/fig1,width=6cm}}
\caption{Statistical disclosure model: median rounds to guess all recipients}
\label{fig1}
\end{minipage}
\hfill
\begin{minipage}[t]{5.75cm}
\mbox{\epsfig{angle=0,figure=graphs/fig2a,width=6cm}}
\caption{Unknown background: median rounds to guess all recipients}
\label{fig2a}
\end{minipage}
\hfill
\end{figure}

\subsubsection{Complex sender behavior and unknown background traffic:}
% trial2
The next simulation examines the consequences of a more complex model for
background traffic, and of several related models for Alice's behavior.

We model the background as a graph of $N$ communicating parties, each of whom
communicates with some of the others.  We build this graph according to the
``scale-free'' model \cite{scale-emergence,scale-analysis}.
Scale-free networks share the ``six degrees of
separation property'' (for arbitrary values of six) of small-world
networks \cite{small-world}, but also mimic the clustering and `organic'
growth of real social
networks, including citations in journals, co-stars in IMDB, and links in
the WWW.
%
%%Should I include this? -NM
%%If we run low on space, I would take out the first half of it, and
%% merge the second half with the above paragraph. -RD
%The scale-free model works as follows: we begin with a small number of
%interconnected vertices (in our case, 2).  Then, we iteratively add vertices
%to our graph, connecting each new vertex to every older vertex $i$ with
%probability $c_i/|E|$, where $c_i$ is the number of existing connections to
%vertex $i$, and $|E|$ is the total number of edges in the graph.  Our
%simulation continues until it has generated $N$ connected vertices.
For these trial attacks, the background messages were generated by
choosing nodes from the graph with probability proportional to their
connectedness.  This simulates a case where users send messages
with equal frequency and choose recipients uniformly from among the people
they know.

We simulated trial attacks for different values of $N$ (number of
recipients) and $m$ (number of Alice's recipients).
Instead of sending one message per batch, however, Alice now
sends messages according to a geometric distribution with parameter
$P_{M}$ (such that Alice sends $n$ messages with probability $P_m(n) =
(1-P_M)P_M^n$).  We tried two methods for assigning Alice's
recipients: In the `uniform'
model, Alice's recipients are chosen according to their connectedness (so
that Alice, like everyone else, is likelier to know well-known people)
but Alice still sends to her recipients with equal
probability.  In the `weighted'  model, not only are Alice's recipients
chosen according to their connectedness, but Alice also sends to them
proportionally to their connectedness.  We selected these
models to examine the attack's effectiveness against users who
behave with the same model as other users', and against users who mimic
the background distribution.

The results are in Figure~\ref{fig2a}, along with the results for the
original statistical disclosure attack as reference.  As expected, the attack
succeeds easily, and finishes faster against uniform senders than weighted
senders for equivalent values of $\left<N,m,b\right>$.
Interestingly, the attack against uniform senders is {\it faster} than the
original statistical disclosure attack---because the background traffic is
now clustered about popular recipients, Alice's recipients stand out
more. %than they did before.
%XXX say more? -NM

\subsubsection{Attacking pool mixes and mix network:}
\label{subsec:sim-complex-mixes}
%trials 3,4
Pooling slows an attacker by increasing the number of output messages
that could correspond to each input message.  To simulate an attack against
pool mixes and mix networks, we abstract away the actual pooling rule used by the
network, and instead assume that the network has reached a steady state, so
that each mix retains the messages in its pool with the same probability
($\Pdelay$) every
round.  We also assume that all senders choose paths of exactly the same
length.

Unlike before, `rounds' are now determined not by a batch
mix receiving a fixed number $b$ of messages, but by the passage of a fixed
interval of time.  Thus, the number of messages sent by the background is no
longer a fixed $b-n_a$ (where $n_a$ is the number of messages Alice sends),
but now follows a normal distribution with mean $BG=125$ and standard
deviation
set arbitrarily to $BG/10$.\footnote{It's hard to determine
  standard deviations for actual message volumes on the deployed remailer
  network: automatic reliability checkers that send messages to themselves
  (``pingers'') contribute to a false sense of uniformity, while some users
  generate volume spikes by sending enormous fragmented files, or maliciously
  flooding discussion groups and remailer nodes. Neither group blends
  well with the other senders.}

\begin{figure}[ht]
\centering
\mbox{\epsfig{angle=0,figure=graphs/fig34,width=4in}}
\caption{Pool mixes and mix networks: median rounds to guess all recipients}
\label{fig34}
\end{figure}

To examine the effect of pool parameters, we fixed $m$ at $32$ and $N$ at
$2^{16}$, and had Alice use the `uniform' model discussed above.  The results
of these simulations are presented in Figure~\ref{fig34}.  Lines running off
the top of the graph represent cases in which fewer than half of the attacks
converged upon Alice's recipients within 1,000,000 rounds, and so no median
could be found.

From these results, we see that increased variability in message delay slows the
attack by increasing the number of output messages that may correspond to any
input message from Alice, effectively `spreading' each message across several
output rounds.  More interestingly, pooling is most effective at especially
high or especially low volumes of traffic from Alice: the `spreading' effect
here makes it especially hard for the attacker to discern rounds that
contain messages from Alice when she sends few messages, or to discern
rounds that don't contain Alice's messages when she sends many messages.

\subsubsection{The impact of dummy traffic:}
%\label{subsec:sim-dummies}
%trials 5a,5b
Several proposals exist for using dummy messages to frustrate traffic
analysis.  Although several of them have been examined in the context of
low-latency systems \cite{defensive-dropping}, little work has been done to
examine their effectiveness against long-term intersection attacks.

First, we choose to restrict our examination (due to time constraints) to the
effects of dummy messages in several cases of the pool-mix/mix network
simulation
above.  Because we are interested in learning how well dummies thwart
analysis, we choose cases where, in the absence of dummies, the attacker had
little trouble in learning Alice's recipients.

% Are there better citations for these strategies?  Are there better names?
Our first padding strategy (``independent geometric padding'')
is based on the algorithm used in current versions of Mixmaster:
Alice generates a random number of dummy messages in
each round according to a geometric distribution with parameter $\Pjunk$,
independent of her number of real messages.

This strategy slows the attack, but does not {\it necessarily} stop it.  As
shown in Figure~\ref{fig5a}, independent geometric padding is most helpful
when the mix network has a higher variability in message delay to
`spread' the padding between rounds.  Otherwise, Alice must send far more
padding messages to confuse the attacker.

\begin{figure}
\begin{minipage}[t]{5.75cm}
\mbox{\epsfig{angle=0,figure=graphs/fig5a,width=6cm}}
\caption{Independent geometric padding: median rounds to guess all
  recipients}
\label{fig5a}
\end{minipage}
\hfill
\begin{minipage}[t]{5.75cm}
\mbox{\epsfig{angle=0,figure=graphs/fig5c,width=6cm}}
\caption{Imperfect threshold padding: median rounds to guess all recipients}
\label{fig5c}
\end{minipage}
\hfill
\end{figure}

Our second padding strategy
(``imperfect threshold padding'') assumes that Alice attempts to implement
the otherwise unbreakable threshold padding strategy (always send $M$
messages total
in every round, adding dummies up to $M$ and delaying messages after $M$ as
necessary), but that she is only sometimes online (with probability
$\Ponline$), and cannot
send real messages or padding when she is offline.  (This
will be typical for most real-world users attempting to implement
threshold padding in a world of unreliable hardware and network
connections.)

Figure~\ref{fig5c} shows the result of imperfect threshold padding. As
before, Alice benefits most from padding in networks with more variable
delays.  Interestingly, in the low delay-variability cases (short paths, low
$\Pdelay$), padding does not thwart the attack even when Alice is online
$99\%$ of the time.

For our final dummy traffic simulation, we assume that Alice performs
threshold padding consistently, but that the attacker has had a chance to
acquire a view of the network's background behavior before Alice first came
online.\footnote{As usual, we assume that the background traffic patterns are
  unchanging.  If background traffic changes significantly over time, Alice
  can defeat this attack by joining the network, sending nothing but padding
  until the network's background characteristics have changed on their own,
  and only then beginning to send her messages.}  Here, our goal was to
confirm our earlier suspicion that padding helps not by disguising how many
messages Alice sends, but by preventing the attacker from learning how the
network acts in Alice's absence.

Figure~\ref{fig5d} compares results when Alice uses consistent threshold
padding and the attacker knows the background to results when Alice does not
pad and the background $\V{u}$ is unknown.  Not only can an attacker who
knows the background distribution identify Alice's recipients with ease,
regardless of whether she uses padding, but such an attacker is {\it not}
delayed by increased variability in message delays.

\begin{figure}
\begin{minipage}[t]{5.75cm}
\mbox{\epsfig{angle=0,figure=graphs/fig5d,width=6cm}}
\caption{Full padding, background known: median rounds to guess all recipients}
\label{fig5d}
\end{minipage}
\hfill
\begin{minipage}[t]{5.75cm}
\mbox{\epsfig{angle=0,figure=graphs/fig6,width=6cm}}
\caption{Partial observation: median rounds to guess all recipients}
\label{fig6}
\end{minipage}
\hfill
\end{figure}

\subsubsection{The impact of partial observation:}
%\label{subsec:sim-partial}
%trial 6
Finally, we examine the degree to which a non-global adversary can
mount a statistical disclosure attack.

% (have we defined 'entry'?) -NM
Clearly, if Alice chooses only from a fixed set of entry and exit mixes as
suggested by \cite{wright03}, and the attacker is watching none of
her chosen mixes, the attack will fail---and conversely, if the attacker is
watching all of her chosen mixes, the attack proceeds as before.  For our
simulation, therefore, we assume that all senders (including Alice) choose
all mixes as entry and exit points with equal probability for each message,
and that the attacker is watching some fraction $f$ of the mixes.  We
simulate this by revealing each message entering or leaving the network to
the attacker with probability $\Pobserve=f$.  The attacker sees a message
when it enters {\it and} when it exits with probability $({\Pobserve})^2$.

The results in Figure~\ref{fig6} show that the attacker can still implement a
long-term intersection attack even when he is only observing part of the
network.  When most of the network is observed ($\Pobserve>70\%$ in our
results), the attack is hardly impaired at all.  As more of the network is
concealed ($.4<\Pobserve<.7$) the attack becomes progressively
harder. Finally, as $\Pobserve$ approaches $0$, the required number of
rounds approaches infinity.

%======================================================================
\section{Conclusions}
\label{sec:conclusion}
Our results demonstrate that long-term end-to-end intersection attacks
can succeed even in the presence of complicating factors.
Here we
suggest several open questions for future work, and offer recommendations
for mix network designs.

\subsubsection{A more realistic model:}
%\label{subsubsec:future-work}
%Many questions remain before the effectiveness of long-term
%intersection attacks can be considered a closed problem.
Our model differs from reality in five major ways.

First,
although real social networks behave more like scale-free networks than like
the original disclosure attack's model, our {\bf models for user behavior} still
have room for improvement. Real users do not send
messages with a time-invariant geometric distribution: most people's email
habits are based on a 24-hour day, and a 7-day week.  Early research on
traffic patterns in actual mix networks \cite{mixvreliable} suggests that this
variation is probably significant.

Second, {\bf real user behavior changes over
time}. Section~\ref{subsec:strenghtening} discusses how an attacker might
handle a scenario where the background traffic changes slowly over
time, and perhaps a similar approach would also help against a sender whose
recipients were not constant.  In the absence of a model for time-variant
user behavior, however, we have not simulated attacks for these cases.

Third, it seems clear that systems with {\bf message linkability},
such as pseudonymous
services, will fall to intersection attacks far faster than anonymizing
services without linkability.  How linkable are messages ``in the wild,'' how
much does this linkability help an attacker, and how can it be mitigated?

Fourth, real attackers are not limited to passive observation. We should
generalize our attacks
to incorporate information gained by an {\bf active attacker}.  Past work on
avoiding blending attacks~\cite{trickle02}
has concentrated on preventing an attacker from being certain of
Alice's recipients---but in fact, an active attack that only reveals
slight probabilities could speed up the attacks in this
paper.

Fifth, Alice has incentive to {\bf operate a mix}, so an attacker
cannot be sure if she is originating messages or just relaying
them~\cite{econymics}. Can we treat this relayed traffic (which goes to
actual recipients) as equivalent to padding (which goes to no recipients)?
Can Alice employ this relayed traffic for a cheaper padding
regime, without opening herself up to influence from active attacks?

% also: knock down nodes, convince Alice to be vulnerable.

% Is this attack better or worse than other attacks?  Probably neither:
% this attack speeds up blending attacks, and relaxes the amount of
% information that those attacks require to succeed.
%
%   Should we explain this someplace? -NM

\subsubsection{Other questions for future research:}
Our analysis has focused on the impact of Alice's actions on Alice alone.
How do Alice's actions (for example, choice of padding method) affect other
users in the system? Are there incentive-compatible strategies that provide
good security for all users?

%There are other possible approaches to thwarting traffic analysis, including
%alternative padding regimes (as mentioned above in the discussion for
%Figure~\ref{fig5a}).  These should be investigated.

It would be beneficial to find closed-form equations for expected number of
rounds required to mount these attacks, as Danezis does for statistical
disclosure.

Many of our simulations found ``sweet spots'' for settings such as mix pool
delay, message volume, padding volume, and so on.  Identifying those points
of optimality in the wild would be of great practical help for users.
Systems could perhaps then be designed to adaptively configure their pooling
strategies to optimize their users' anonymity.

% impact on reputation systems

% better mix algorithms

\subsubsection{Implications for mix network design:}
%\label{subsubsec:implications}
%What steps
%should mix-net designers take to frustrate intersection attacks?

First, {\bf high variability} in message delays is essential.  By `spreading'
the effects of each incoming message over several output rounds, variability
in delay increases each message's anonymity set, and amplifies the effect of
padding.

{\bf Padding} seems to slow traffic analysis, especially when the padding is
consistent enough to prevent the attacker from gaining a picture of the
network in Alice's absence.  On the other hand, significant padding volumes
may be too cumbersome for most users, and perfect consistency (sending
padding from the moment a network goes online until it shuts
down) is likely impractical.

Users should be educated about the effects of {\bf message volume}: sending
infrequently is relatively safe, especially if the user doesn't repeat the
same traffic pattern for long.

%The threat of non-global observers must not be ignored.
Mix networks should take steps to {\bf minimize the proportion of observed
 messages}
that a limited attacker can see entering and exiting the network.  Possible
approaches include encouraging users to run their own mixes; choosing
messages' entry and exit points to cross geographical and organization
boundaries; and (of course) increasing the number of mixes in the network.

Much threat analysis for high-latency mix networks has aimed to provide
perfect protection against an eavesdropper watching the entire network.
But unless we adopt an unacceptable level of resource demands, it seems that
some highly distinguishable senders will fall quickly, and many ordinary
senders will fall more slowly, to long-term intersection attacks.
We must stop asking whether our anonymity designs can forever defend every
conceivable sender.
%     against a global passive adversary.
%Instead, we must ask _how long_
%you can defend _which senders_ against an adversary who sees _how much_.
%We show that mix networks are not secure against this global observer,
%and that they can also be defeated by partial observers.
Instead, we should attempt to quantify the risk: {\it how long} our
designs can defend
{\it which senders} against an adversary who sees {\it how much}.

% We said that fixed entry/exit might help too, but I now think it
% wouldn't.  Suppose the attacker observes c nodes out of n.  If I
% choose random paths, the attacker sees (c/n)^2 of my traffic with
% probability 1.  If I choose a fixed entry, the attacker sees c/n
% of my traffic with probability c/n.  No real difference.
%
% In fact, a limited attacker (P_observe=.2) with a diffuse target should
% _hope_ that people choose fixed entries.  If they do, then he can
% eventually make the intersection attack work against the ones who use him
% as their fixed entry: he breaks 20% of the senders.
% But if they _don't_ choose fixed entries, he only
% sees 4% of everyone's traffic, which is not enough to break anybody.
%
%
% Fixed entries are a good idea for low-latency systems, when a single
% connection with an attacker on each end compromises a sender--receiver
% link.  With high-latency systems, however, the number of observed
% entry/exit pairs matters: so being _certainly_ very seldom observed can
% be better than being _possibly_ observed somewhat seldom.
%

\section*{Acknowledgments}
Thanks go to Gerald Britton, Geoffrey Goodell, Novalis, Pete St. Onge, Peter
Palfrader, Alistair Riddoch, and Mike Taylor for letting us run our
simulations on their computers; to Peter Palfrader for helping us with
information on the properties of the Mixmaster network; and to George Danezis
for his comments on drafts of this paper.

%======================================================================
\bibliographystyle{plain} \bibliography{e2e-traffic}

\end{document}

% 'In order to' -> 'to'
% very -> damn -> ''
% mix network, not mix-net, not mixnet. 
