\documentclass{llncs}

\usepackage{url}
\usepackage{amsmath}
\usepackage{epsfig}

\renewcommand\url{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
\newcommand\emailaddr{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
\newcommand\XXXX[1]{{\small\bf [XXXX #1]}}

\newcommand{\workingnote}[1]{}        % The version that hides the note.
%\newcommand{\workingnote}[1]{(**#1)}   % The version that makes the note visible.

\hyphenation{mix-net mix-nets}

\newenvironment{tightlist}{\begin{list}{$\bullet$}{
  \setlength{\itemsep}{0mm}
    \setlength{\parsep}{0mm}
    %  \setlength{\labelsep}{0mm}
    %  \setlength{\labelwidth}{0mm}
    %  \setlength{\topsep}{0mm}
    }}{\end{list}}

% Cut down on whitespace above and below figures displayed at head/foot of
% page.
\setlength{\textfloatsep}{3mm}
% Cut down on whitespace above and below figures displayed in middle of page
\setlength{\intextsep}{3mm}

\begin{document}
%\title{The Disadvantages of Cascade Mix Networks and How to Overcome Them}
%\title{Synchronous Batching:\\From Cascade Networks to Free Routes
\title{Synchronous Batching:\\From Cascades to Free Routes}
%\title{Mixing in Step:\\Synchronous Batching from Cascade Networks to
%  Free Routes
%\title{Groove Mixing:\\Synchronous Batching from Cascade Networks to Free Routes
%\title{Synchronous Batching:\\Beat Mixing from Cascade Networks to Free Routes
%\title{Heartbeat Mixing: Synchronous Batching from Cascade Networks to Free Routes
%\title{Synchronous Mixnets: From Cascades to Free Routes
%\thanks{Portions of this paper were inspired by discussions with David
%Hopwood. We invite him to be an author, but we have not been able
%to contact him. We'll keep trying.}}
%but we have been unable to contact
%him since beginning the paper. We'll keep trying.}}

\author{Roger Dingledine\inst{1} \and Vitaly Shmatikov\inst{2} \and Paul Syverson\inst{3}}
\institute{The Free Haven Project \email{(arma@freehaven.net)} \and
SRI International \email{(shmat@csl.sri.com)} \and
Naval Research Lab \email{(syverson@itd.nrl.navy.mil)}}

\maketitle
\pagestyle{empty}
%======================================================================
\begin{abstract}

%Tradeoffs between
%Choosing between
%Comparing
%various 
The variety of possible anonymity network topologies has spurred much
debate in recent years.
%Topology in anonymity systems has been hotly disputed in recent years.
%Synchronous mixnets are vastly superior to asynchronous mixnets in
%many ways.
In a synchronous batching design, each batch of messages enters the
mix network together, and the messages proceed in lockstep through
the network. We show that a synchronous batching strategy can be used
in various topologies, including a free-route network, in which senders
choose paths freely, and a cascade network, in which senders choose from
a set of fixed paths. We show that free-route topologies can provide
better anonymity as well as better message reliability in the event of
partial network failure.

\end{abstract}
%======================================================================
%\begin{quotation}
%\emph{I saw the best minds of my generation destroyed by madness...}
%\end{quotation}
%\hspace*{\fill}--Allen Ginsberg in ``Howl''

\section{Introduction}
\label{sec:intro}

Modern deployed mix networks, including Mixmaster~\cite{mixmaster-spec} and
its successor Mixminion~\cite{minion-design}, are subject to partitioning
attacks: a passive adversary can observe the network until a target
message happens to stand out from the others~\cite{disad-free-routes},
and an active adversary can manipulate the network to separate one
message from the others via blending attacks~\cite{trickle02}.
Berthold et al.~argue~\cite{disad-free-routes} that partitioning
opportunities arise because the networks use a \emph{free-route}
topology---one where the sender can choose the mixes that make up her
message's path. They suggest instead a \emph{cascade network} topology,
where all senders choose from a set of fixed paths through the mix network.

%In this paper we argue that the cascade design resolves the attacks not
%because of its network topology, but because of a property of its batching
%strategy. 
%We explore this \emph{synchronous batching} approach in Section 2.
%We find that this \emph{synchronous batching} approach prevents

In this paper we argue that the cascade design resolves these attacks
because it uses a \emph{synchronous batching} strategy, not because it
uses a particular network topology. We show that synchronous batching
prevents these attacks even when free routes are used. Further, we
explore three topologies with synchronous batching---cascades, stratified
(a restricted-route hybrid topology), and free-route---and find that
the free-route
network provides the highest expected anonymity as well as the best
robustness to node failure.

%and provides comparable latency to a cascade topology,

%\item It allows senders to verify message processing, enabling reputation
%systems such as those described in \cite{mix-acc,casc-rep}.
%\item It helps block blending attacks compared to asynchronous batching,
%because mixes cannot delay messages without permanently dropping them.

% * exit policies impose differentiation anyway. might as well make it
%   formally part of the topology.

In Section~\ref{sec:sync-batching} we describe the synchronous batching
model. Section~\ref{sec:related} relates previous work
to synchronous batching, including a response to each of the arguments
from~\cite{disad-free-routes}. Section~\ref{sec:scenarios} presents the
three topologies, and Section~\ref{sec:model} describes their entropy
(average anonymity the sender expects from the network). We use a model
checker to compute entropy for networks with 16 nodes:
we present our results and assess the assumptions behind them in
Section~\ref{sec:graphs}.
Section~\ref{sec:metrics} considers other metrics such as bandwidth
requirements, latency, and robustness.
%, and wrap up in Section~\ref{sec:conclusion}
%with future work and possible extensions.

\section{Synchronous batching}
\label{sec:sync-batching}

Chaum proposed hiding the correspondence between sender and recipient
by wrapping messages in layers of public-key cryptography, and relaying
them through a path composed of \emph{mixes}~\cite{chaum-mix}. Each mix
in turn decrypts, delays, and re-orders messages, before relaying them
toward their destinations.

A mixnet design groups messages into batches and chooses paths;
its design choices affect the degree of anonymity it can provide
\cite{trickle02}.
We might define ideal anonymity for a mixnet to be when an attacker can
gain no information (beyond prior knowledge) about the linkage between
messages entering and
leaving the network, other than that the maximum time between them is
equal to the maximum network latency.

This ideal is not achieved by protocols like Mixminion that use locally
computed random delays: if the maximum latency of such a network is $t$,
the probability that an output message corresponds to a particular
input message might be considerably higher than for other messages that
have entered over that time. (In principle, because of its pool mode,
a message's maximum latency could be infinite, but that's not a
significant improvement
in practice: if the probability of a given latency $t$ drops off
exponentially with $t$, then so does the probability that
a message leaving the network could have been sent that long
ago~\cite{Serj02}.)
Also, because Mixminion is both {\em asynchronous} (messages can enter
and leave the network at any time) and uses free routes, it is subject
to the attacks from~\cite{disad-free-routes} described in
Section~\ref{subsec:disad} below.

A network that uses \emph{synchronous batching}
has a fixed {\em batch period}, $t_\mathrm{batch}$, which
is related to the maximum desired latency, for example 3 hours.
Messages entering the network in each batch
period are queued until the beginning of the next period. They are
then sent through the mixnet synchronously, at a rate of one hop per
{\em hop period}. All paths are a fixed length $\ell$ hops, so that if
no messages are dropped, the messages introduced in a given batch will
progress through their routes in lockstep, and will all be transmitted
to their final destinations $\ell$ hop periods later. Each layer of a
message, once decrypted, specifies the hop period in which it must be
received, so that it cannot be delayed by an attacker.

The {\em width} $w$ of a mixnet using synchronous batching
is the number of nodes that simultaneously process messages from a given
batch in each hop
period. (If this is not constant, we can still talk about the maximum,
minimum, and mean width.) When $w=1$, we have a cascade.
The latency is between $\ell t_\mathrm{hop}$ and $t_\mathrm{batch} +
\ell t_\mathrm{hop}$, depending on when the message is submitted.
We might set $t_\mathrm{hop} < t_\mathrm{batch}/\ell$,
so the latency is at most $2t_\mathrm{batch}$, independent of the
path length.  Thus the entire batch is processed and delivered before
the next batch enters the network. Under this constraint, we can give
nodes the maximum opportunity to make use of the available bandwidth,
and the best chance at delivery robustness, by setting $t_\mathrm{hop}
\simeq t_\mathrm{batch}/\ell$.

\section{Related work}
\label{sec:related}

\subsection{Synchronous batching timing model and protocol}

Dingledine et al.~present in~\cite{mix-acc} a mix network that uses
synchronous batching. We refer to that paper for a detailed discussion
of the timing model, how to handle loosely synchronized clocks, and the
step-by-step instructions for senders and mixes to use the network and
judge whether messages have arrived on time.

That paper also describes a receipt and witness system by which senders
and mixes can prove that a given mix failed to pass on or accept a given
message. These receipts allow a reputation system: senders can recognize
which nodes tend to drop messages, and avoid them in the future.

\subsection{The Disadvantages of Free Mix Routes}
\label{subsec:disad}

% This section is based on notes from David Hopwood. Will the real
% David Hopwood please come forward and become an author?

Berthold et al.\ argue~\cite{disad-free-routes} that cascades are safer
than free-route mix networks against a strong adversary who watches all
links and controls many of the mixes. We consider each of their attacks
below and find in each case that the arguments of~\cite{disad-free-routes}
do not apply if the free-route network is synchronous. Indeed, against
some of the attacks a free-route network is much stronger than
the cascade network.

{\bf{Position in mix route:}} This attack partitions messages that go
through a given honest node based on how many hops each message has
travelled so far. If the adversary owns all other nodes in the network,
he can distinguish messages at different positions in their path (say,
one has traversed two mixes already, and another has traversed three), and
thus learn the sender and recipient of each message. The authors note:
\emph{Eventually, a message is only unobservable in that group of messages
which have this mix on the same routing position.} But in the synchronous
design, that's not a problem because this group is large (if only one
mix is trustworthy, $1/w$ of all messages in the batch). They conclude:
\emph{If only one mix of a route is
trustworthy, then the achievable anonymity is distinctly lower in a mix
network compared to a synchronously working mix cascade.} The actual
conclusion should be: If only one mix of a route is trustworthy, then
the achievable anonymity for a given topology is distinctly lower in an
asynchronous mixnet than in a synchronous mixnet.

{\bf{Determining the next mix:}} An adversary owning most nodes
in the network can attack the honest mixes: he can link senders
to messages entering honest mixes, and he can link receivers to
messages exiting honest mixes. Thus the target messages will only
be mixing with other messages that enter the mix node at that time,
and not with other messages elsewhere in the network. Even if senders
use the same path for multiple messages, the authors point out that
\emph{the batches always generate different anonymity groups.} Again,
the important property is whether the network uses synchronous batching,
not whether it uses free routes. In a synchronous batching design,
all messages in a batch exit the network together after the last
hop, so messages cannot be partitioned based on when they enter or exit
the network. %\footnote{Free-route protocols where each sender
%fixes his path are in fact vulnerable to attacks (e.g. tagging
%attacks~\cite{disad-free-routes,raymond00}) that free-route protocols
%with randomly chosen paths can resist~\cite{minion-design}, but so
%long as they use synchronous batching, they are not vulnerable to this
%particular attack.}

% XXX I think we mean 'unlinkability' here? -RD
{\bf{Probability of Unobservability:}} The authors explain that the
cascade topology optimizes for the case that only one mix node is
honest. They compare a 4-node cascade (with 3 compromised nodes) to
a 20-node free-route mix network (with 75\% compromised nodes), and
find that whereas
the cascade provides complete protection, a user choosing four
nodes in the free-route network has a non-trivial chance of picking an
entirely compromised path. But this is a false comparison. A better
comparison would consider either a free-route mix network with 4 nodes,
or a network of five $\ell=4$ cascades---so
the cascade network also has a chance of fully-compromised paths.
In Section~\ref{sec:graphs} we show that while each cascade in a cascade
network of width $w$ only mixes $1/w$ of the messages from the batch,
a free-route network can mix all the messages from the batch and thus
achieves significantly stronger anonymity even with 75\% compromised
nodes.

{\bf{Active Attacks:}} The authors discuss an active attack called a
trickle attack~\cite{babel}, wherein the adversary prevents legitimate
messages from entering the batch, or removes some messages from the
batch, so he can more easily trace Alice's message. To make the attack
less overt, he can send his own messages into the batch, or replace
the messages already in the batch with his own messages. These attacks
where the adversary \emph{blends} his messages with Alice's message
threaten both synchronous-batching and asynchronous-batching networks in
all topologies, and a complete solution that is practical is not
known~\cite{trickle02}.
The authors of~\cite{disad-free-routes} present some approaches to
mitigating this attack in a cascade environment, but a variety of other
approaches have been developed that also work in a free-route
environment. We discuss them next. Other active attacks are described
in Section~\ref{subsec:anonymity-robustness}.

\subsection{Blending attacks}
\label{subsec:blending}

Active attacks where the adversary targets a message by manipulating
the other messages in the system are a widespread problem in mix-based
systems. Solutions fall into three categories: attempts to \emph{prevent}
the attack, attempts to \emph{slow} the attack, and attempts to
\emph{detect and punish} the attacker.

One prevention technique requires each sender to acquire a \emph{ticket}
for each mix in his path before joining a given batch (the senders
receive blinded tickets~\cite{chaum-blind} so the mixes cannot
trivially link them to their messages). Mixes ensure their messages
come from distinct senders, so Alice can expect good mixing at each
honest node in her path~\cite{web-mix}. For cascades this approach is
clearly efficient because Alice only needs tickets for her chosen
cascade~\cite{disad-free-routes}, but her anonymity set is still
limited to that one cascade. We conjecture that other topologies
can give equivalent anonymity while only obtaining tickets from a
fraction of the mixes, but we leave that analysis to future work. A
bigger problem with the ticket scheme, however, is the feasibility
of requiring all users to register with the mixes: it is hard to
imagine that attackers can be excluded from being registered in an
open network~\cite{sybil}. Other prevention techniques use complex
cryptography to provide \emph{robustness}~\cite{flash-mix} --- messages
are only delivered if a threshold of the mixes agree that the batch has
been properly processed.

Techniques to slow the blending attack are generally designed for
asynchronous mix networks. In Mixmaster
and Mixminion, the goal of the batching algorithm is to hide from
the adversary when an outgoing message entered the mix. Mixes `pool'
some messages from previous batches, to try to mix them as far back
as possible. These approaches force the adversary to spend more time
and messages on the attack~\cite{trickle02}. Some designs allow a
pool mix to commit to its choice of randomness to allow verifying
its behavior~\cite{FGJP98}. Link encryption, as well as
% \cite{Jer2000}
Babel's \emph{inter-mix detours}~\cite{babel} and early Onion
Routing's \emph{loose routing}~\cite{goldschlag96}, aim to block a
limited adversary from knowing when his message has exited a mix.
This also complicates blending because even the sender cannot always
recognize a message he created.  In stop-and-go
mixes~\cite{stop-and-go}, each sender specifies a time window for each
mix in his path: as with synchronous batching designs, messages arriving
outside the time window are dropped, so the attacker cannot arbitrarily
delay messages without destroying them.

Other approaches aim to detect and deter misbehavior. Chaum suggests
allowing each sender to examine the output
of each mix~\cite{chaum-mix}, but this approach scales poorly. Danezis
and Sassaman propose a `heartbeat' dummy scheme~\cite{danezis:wpes2003}
for asynchronous pool mix networks: dummies are sent from a node in the
network back to itself, creating an early warning system to detect if
the adversary is launching a blending attack.
%(but this approach does not
%detect whether mixes drop messages directly from senders)
\emph{Reliability} mechanisms aim to improve a sender's long-term odds of
choosing a mix path with well-behaving nodes. The witness-and-receipt
system in~\cite{mix-acc} provides such a reputation system for
synchronous-batching networks. Another reputation system for
cascades~\cite{casc-rep} allows mixes to send test messages into
the network to detect misbehavior. Finally, randomized partial
checking~\cite{randomized-checking} allows each mix to show evidence
of its correctness by revealing a pseudo-randomly selected subset of
its input-output relationships, while the mix network as a whole still
protects linkability with high probability.

Clearly much work has been done to address blending attacks. Each
topology seems to have some plausible partial solutions.
%solutions apply to synchronous free-route networks
%as well as cascade networks.

\section{Threat model and mixnet topologies}
\label{sec:scenarios}

Our analysis considers a slight variant on the traditional
powerful adversary who observes globally and controls a fraction
of the nodes~\cite{disad-free-routes}. We assume the adversary
compromises nodes at random rather than in a targeted fashion (see
Section~\ref{subsec:random-adversary} for more discussion on this point).
Along with being able to control some of the nodes, our adversary can
observe messages from senders to the mixnet and from the mixnet to
receivers, but our initial analysis assumes he cannot observe the
links between honest nodes in
the mixnet (in Section~\ref{subsec:average-actual} we argue that
with high probability, observing these links will not yield much
new information anyway).
This paper only examines sender anonymity, though many of the advantages
of synchronous batching may carry over to receiver anonymity.

%An adversary is assumed to only watch messages entering and leaving
%the mixnet. This is admittedly weaker than an adversary that can
%observe some links inside the network as well as at the periphery.
%However, a hostile node can see more than an adversary that observes
%that node's network connections. In this sense our model subsumes such
%a threat provided that the link-observing adversary is not global.

We assume that selective forwarding will be discovered, and either the
attack will be prevented or the malfunctioning node will be removed
from the network (see Section~\ref{subsec:blending}). We address the
attack of intersecting individual batches in Section~\ref{subsec:disad}
(under ``determining the next mix''), but unsurprisingly, we leave the
long-term intersection attack~\cite{langos02,statistical-disclosure}
unsolved. Further active attacks to degrade anonymity are described in
Section~\ref{subsec:anonymity-robustness}.

We analyze a 16 node mixnet where all messages follow a four node path.
%(We do say a few things about a 16 node
%path free route.)
%We use the same number for all mixnets to
%have a reasonably fair comparison of available resources.
Besides being a tractable size for analysis, 16 nodes also
approximates deployed mixnets. (Mixminion currently
has between 20 and 30 active nodes.) One might argue that a 4 node mixnet
gives better security, because all messages are mixed together
in any topology. We assume a larger network is needed because 1) the
bandwidth of a single node may be insufficient to handle all the traffic;
2) a single path may not include as many choices for jurisdiction as
some users want; and 3) a single path is not very robust, either to
network attacks or to nature.

Messages proceed through the network in \emph{layers}; all the
nodes in a layer process messages of one mixnet batch at the same time.
In general we describe networks as $w$x$\ell$, where $w$ is the number
of nodes at each layer and $\ell$ is the number of nodes in a path.
We consider three basic topologies: a 4x4 cascade mixnet in which all
messages pass through four cascades of length four; a 4x4
\emph{stratified} mixnet, in which all messages pass through four
layers of disjoint nodes such that messages may pass from any node at
one layer to any node at the next layer; and a 16x4 free-route mixnet,
in which all nodes may receive messages at all layers. Note that
because free-route nodes are reused, `16x4' does not mean 64 nodes.
Examples of the three topologies are illustrated below.
%Examples of a 2x2 cascade mixnet, a 2x2 stratified mixnet, and a 4x2
%free-route mixnet are illustrated in Figures~\ref{fig:casc-2x2},
%\ref{fig:sa-2x2}, and \ref{fig:free-4x2}, respectively.

\begin{figure}
\begin{minipage}[h]{3.5cm}
\mbox{\epsfig{angle=0,figure=casc-2x2,width=3.5cm}}
\caption{A 2x2 cascade mix network (4 nodes)}
\label{fig:casc-2x2}
\end{minipage}
\hfill
\begin{minipage}[h]{3.5cm}
\mbox{\epsfig{angle=0,figure=sa-2x2,width=3.5cm}}
\caption{A 2x2 stratified network (4 nodes)}
\label{fig:sa-2x2}
\end{minipage}
\hfill
\begin{minipage}[h]{3.5cm}
\mbox{\epsfig{angle=0,figure=free-4x2,width=3.75cm,height=4.5cm}}
\caption{A 4x2 free-route mix network (4 nodes)}
\label{fig:free-4x2}
\end{minipage}
\hfill
\end{figure}

\input{model}

\section{Graphs and Analysis}
\label{sec:graphs}

\begin{figure}[t]
\unitlength=1in
\centering
\begin{picture}(5.0,2.0)
\put(2.5,.75){\makebox(0,0)[c]{\epsfig{angle=270,figure=badnodes,width=5in}}}
\end{picture}
%\mbox{\epsfig{angle=270,figure=badnodes,width=4in}}
\caption{Entropy vs probability of compromise for each node (16 nodes)}
\label{fig:badnodes}
\end{figure}

Figure~\ref{fig:badnodes} shows the entropy Alice can expect from
each of the three topologies.
The cascade network immediately divides the incoming batch by the
number of cascades, so it provides substantially less protection even
with many compromised nodes. The stratified topology provides about
the same expected entropy as the free-route topology.
In this section and the next we will examine other metrics for deciding
which is best. Further graphs in Appendix~\ref{sec:entropy-per-hop}
indicate how much entropy is achieved after a given number of steps
through each network.

\subsection{Is the adversary really randomly distributed?}
\label{subsec:random-adversary}

To keep our model tractable, we have assumed that each node has an
equal chance of being controlled by the adversary. A real adversary
might prefer to control certain key nodes in the topology.
To justify our assumption, we might assume that secure nodes (or
equivalently, vulnerable nodes) are randomly distributed. That is,
rather than letting the adversary have his pick of nodes, we instead
let the adversary control all the machines that have some security
vulnerability. A related approach would be to place particularly secure
and trusted (or at least jurisdictionally separate) nodes in key places
in the topology: if such nodes are discouragingly secure, they are no
longer an appealing target.

Alternatively, the mixes can periodically generate a communally random
seed to reorganize the network~\cite{casc-rep}. Thus,
being able to control or sign up a node does not allow the adversary to
dictate its position in the topology. This may be a satisfactory solution,
though it is not a complete solution because not all nodes are equal:
e.g.~nodes that refuse to deliver messages to the final recipients
shouldn't be chosen as exit nodes, so they may be less appealing targets.

\subsection{Choosing the same node twice in a row}

Conventional wisdom (see e.g.~\cite{minion-design}) suggests that in
a free-route network, Alice should never pick the same node twice in
a row: it increases her risk of picking only bad
nodes. We find that for a sufficiently large network, this increased
complexity in path selection has little impact on Alice's entropy.

Intuitively, when the adversary density is low, entropy will be high
in either case; whereas when most nodes are owned by the adversary,
the difference between picking between $B$ and $B-1$ bad nodes is slight.

More formally, for $G$ good nodes and $B$ bad nodes, the
chance of selecting a bad node next is $\frac{B-1}{G+B}$ if
the current node is bad and $\frac{B}{G+B}$ otherwise.
The difference is only $\frac{1}{G+B}$: it does not depend on what
fraction of the nodes are bad. 
%As the network size grows,
%the shift in probability distribution becomes negligible.
Specifically, for a 16x4 free-route mixnet (8 bad nodes), it's a 5.1\%
chance of an all bad path if a node cannot be picked twice in a row,
and 6.3\% chance if it can. With 32x4, it's 5.7\% vs.\ 6.3\%.

% For a 16x4 free-route mixnet with
%4 bad nodes, the difference amounts to a 2\% chance of an all bad
%path if the same node cannot be picked twice in a row and a 3.9\% chance
%if it can. With 8 bad nodes, this becomes 5.1\% and 6.3\% respectively.
%As the percent of bad nodes grows, the shift in probability becomes
%more negligible. This is also true as the network size grows.
%For example, for a 32x4 mixnet with half bad nodes, the difference
%is 5.7\% vs.\ 6.3\% . At what point this difference can safely
%be ignored is a question that can only be answered in context.
%We will assume it to be of no practical significance for the
%remainder of the paper.

% Mention that we have that entropy different for each of \ell hops.
% But it's just \ell times the above negligible difference, right? -RD

% Also, talk a bit about path-selection-algs where no node is repeated
% *ever* in the path. Would that help? It certainly seems that it would
% reduce entropy in scenarios with low adversary density, because he
% knows she doesn't exit from her entry node. Hm. -RD

\subsection{Reputations and Node Preferences}

Most deployed systems let users choose a preferred entry or exit hop,
e.g.~based on trust. A skewed distribution of messages only at the entry
or exit of the network should not impact entropy too much---we see from
Figures \ref{fig:caschops}-\ref{fig:freehops} that much of each network's
entropy is achieved from just a few hops.

Reputation systems, on the other hand, encourage users to prefer certain
nodes at \emph{each layer} of the network. Further, reputation
information can be exploited by an adversary to reduce anonymity,
for example by predicting the user's behavior based on reputation
statistics, or by attracting more traffic by building a strong reputation
or degrading the reputation of others. Placing nodes with similar reputation
in the same layer of a stratified network, or placing them
in the same cascade, might complicate these attacks, but employed naively,
this can facilitate other attacks~\cite{casc-rep}. This
topic merits further investigation.


\subsection{Average Entropy vs Actual Entropy}
\label{subsec:average-actual}

The graphs and analysis above are for average entropy---the network's
behavior for very large batches. But in reality the batch size
may be quite small, and each sender chooses paths independently from the
others. We must consider the possible variance in entropy depending
on the actual path choices.

For $m$ messages to $u$ buckets (nodes in a layer), we find the chance
that any bucket will have less than $p$ messages based on
Maxwell-Boltzmann statistics and inclusion-exclusion:

\begin{eqnarray*}
{u \choose 1} \sum_{i=0}^p (\frac1{u})^i (1-\frac1{u})^{m-i} {m \choose i}
- {u \choose 2} \sum_{i=0}^p \sum_{j=0}^p
  (\frac1{u})^i (\frac1{u})^j (1-\frac{2}{u})^{m-i-j} {m \choose i,j} \\
+ {u \choose 3} \sum_{i=0}^p \sum_{j=0}^p \sum_{k=0}^p
  (\frac1{u})^i (\frac1{u})^j (\frac1{u})^k (1-\frac{3}{u})^{m-i-j-k} {m \choose i,j,k}
- \dots
\end{eqnarray*}

For $m=128$ messages and $u=4$ nodes (i.e.~cascade or stratified network),
the chance of any node getting less than 16 messages (compared to
the 32 we expect each to get) is $6 \cdot 10^{-4}$---meaning with very high
probability the average entropy represents the behavior we will see
in reality. However, for $u=16$ nodes (free-route), 48\% of
the time some node will get less than half the expected number; and it is
not until a batch size of 480 that this metric reaches 1\%.

This result makes sense: each link on a free-route network has a smaller
expected number of messages, so variations have a bigger impact. Whether
it is acceptable depends on a number of factors. First, how large do we
expect batches to be in reality? The Mixmaster network receives more than
1000 messages an hour, which seems plenty sufficient.
Second, how bad is it when a link varies by half the expected volume? If
we change our metric to require at least 2 messages on each link, then for
$m=128$ we find that only 1\% of the cases fall outside this value.
Another significant question is how the number of layers affects the
results: the more layers, the greater the chance that some of them are
well balanced. The exact relation and its effect on entropy are
open questions.

Danezis also considers this issue of variance from average entropy for his
mixnet design based on sparse expander graphs~\cite{danezis:pet2003}. He
argues that having at least one message on each link is sufficient for
basic protection, and he uses a similar approach to show that his design
achieves this distribution with high probability. He further raises the
idea of padding unused links to guarantee one message on each link, with
the aim of preventing trivial traffic analysis attacks. Is it worthwhile
to prevent this simple attack? Are all other attacks significantly
harder? Clearly more research remains.

\subsection{Flooding attacks to degrade anonymity or service}

In Section \ref{subsec:blending} we talk about techniques to discourage
a mix from dropping or substituting messages in the batch. But
what if the adversary simply submits \emph{more} messages to the batch?

It turns out that as long as $k$ of the $n$ input messages come from
honest senders, Alice will still be assured that she can expect entropy
based on a batch of $k$ messages. That is, assuming uniform distribution
of messages over mixes, the entropy of a baseline
network (all-honest senders) plus hostile messages is at least the entropy
of the baseline network by itself. This is different from the pooling
batching strategy~\cite{trickle02}, where messages from the adversary
will influence the behavior (and thus entropy) of Alice's message.

On the other hand, directed floods can overflow node capacity. We
might use techniques where mixes can prove that any output message was
derived from an input message, which reduces the problem to detecting
or stopping floods at the beginning of the batch. We might also argue
that the fraction of adversary messages in the batch limits the maximum
size of the flooding attack---honest messages will still be randomly
distributed. In general, this flooding issue is an unsolved problem for
all mixnet designs; more research remains.

\section{Other metrics for comparison}
\label{sec:metrics}

\subsection{Throughput, delay, capacity, bandwidth}

One parameter we cannot control is the rate that messages arrive to the
mixnet. Similarly, we cannot control the latency that users will be
willing to accept. To make the analysis more concrete, assume we choose
$\ell=4$, that users deliver 128 messages every 3 hours, and that users
will tolerate a latency of 3--6 hours (which is on par with the latency
experienced by a typical Mixmaster message, though
it could be much longer in theory).

We can compute the maximum flow rate (traffic in unit time) through
any given node. Assume that sending a message over a single hop
consumes a fixed amount of network traffic; we can then use that as
the unit for traffic. Let $T_\mathrm{batch}$ be the expected
throughput in a single batch period, i.e.~the number of messages that
go through the network in a batch. If the available nodes are used
optimally (see Section \ref{subsec:average-actual}), the flow rate
required through each node is $\frac{T_\mathrm{batch}}{w \cdot
  t_\mathrm{hop}} = \frac{\ell \cdot T_\mathrm{batch}}{w \cdot
  t_\mathrm{batch}}$.

%We could choose $t_\mathrm{batch} = t_\mathrm{hop}$, so that
%processing occurs in uniform systolic cycles:
%we let 32 messages in every 45 minutes.
%This is 42.7 messages/hour for a 4x4 mixnet,
%whether cascade or stratified.  In a 16x4 free-route mixnet the
%expected flow rate is the same because each node is processing a
%quarter as many messages per batch but processing four mixnet batches
%in the same period.\footnote{On first sight it also looks like multiple
%different batches are effectively being mixed together at each hop---
%but an adversary watching messages that enter and leave the
%mixnet can easily partition the batches.} Latency for Alice's message
%is between 3 hours and 3 hours 45 minutes,
%depending on when Alice's message arrives.
%Maximum possible entropy is 5 (3 for the 4x4 cascade).
%
% I really don't like this option. The entropy is going to suck,
% particularly wrt the free-route system, where you get an expected
% *two* messages into each node per batch. Even Vitaly can't claim
% the actual entropy will whp approximate the average entropy for that
% situation. :)
% I bring the issue up briefly below, but don't present the bad option
% as the first one the reader sees.

If we choose $t_\mathrm{batch} \simeq \ell t_\mathrm{hop}$, all messages
clear the mixnet before the next batch enters: we introduce a batch
of 128 messages every 3 hours. We get 42.7 messages/hour for all three
topologies. Latency is between 3 hours and 6 hours, depending on when
Alice's message arrives. By accepting messages over a large amount
of time, we get better expected entropy; make the actual behavior
of the network closer to the expected behavior of the network (as in
Section~\ref{subsec:average-actual}); and smooth spikes and troughs
in the rate of incoming messages.

In the free-route network, each node needs to process 8 messages at a
time and is active at each layer. The cascade and stratified networks
require a larger capacity from each node: they must handle 32
messages at once (128/$w$), but they are idle for all but one hop in the
batch. One could imagine a \emph{systolic} or \emph{pipelined} network
where $t_\mathrm{batch} = t_\mathrm{hop}$ and 32 messages are let in
every 45 minutes. In this
case the capacity of nodes in cascade and stratified networks would also
be 8, and indeed the latency could be cut to between 3 hours and 3 hours
45 minutes---but the expected entropy would be cut by a factor of $\ell$.

Bandwidth is acceptable. Assuming a higher load of 5000 messages per
batch, and 32KB per message (as in Mixminion), nodes in the free-route
system use less than 4KB/s (nodes in the other
topologies use 16KB/s but only 1/4 as often). That's well within
the capabilities of current Mixmaster nodes.

%If it were desired to
%have the maximum latency the same, it would be necessary to reduce
%$t_\mathrm{hop}$ to 15 minutes. This would correspondingly raise the
%flow rate per node by a factor of three. Note that the longer
%$t_\mathrm{batch}$ means more smoothing of spikes and troughs in
%$T_\mathrm{batch}$. Also, while it is still possible to coordinate
%messages within a batch to arrive at the same free-route node in the
%same period by choosing the layer appropriately, it is not possible to
%coordinate messages from multiple batches to flood a given node at the
%same time (as would be possible in the uniform systolic case).

%Note that synchronous batching does not necessarily mean a "one
%latency fits all" approach. People who want a larger anonymity set can
%agree to send their messages only at intervals of $k \cdot
%t_\mathrm{batch}$, so those batches will have more messages (but at
%the expense of the other batches). This implies a fair degree of coordination
%and the flip side is a freedom to flood the network to some extent.
%It may also attract attackers to such cabals so that the net anonymity
%does not increase.

\subsection{Robustness of Message Delivery}

Better entropy can be achieved by longer routes: e.g., if we
form our 16 nodes into a 1x16
cascade or a 16x16 free-route, there is almost no falloff in entropy
until each node has a ninety
percent chance of being compromised. But this ignores robustness of
message delivery. % (We discuss robustness of anonymity to active attacks
%below.)
For the free-route 16x16 mixnet with only a single node failure,
% (and for
%randomly chosen routes through this mixnet)
nearly two thirds of
messages will be undelivered (because they will need to pass through
it at some point).
% Such a network is very brittle.
The 1x16 cascade
is even worse: a single node crash blocks all message delivery.
(We might take advantage of schemes to bypass a single failed node
\cite{pfitzmann85}, but it's not clear how this works with the synchronous
approach in all topologies.)
Parallel cascades can be added to the network, but unlike the
free-route, they will \emph{a priori} reduce the entropy of an input
message for a given size mixnet batch. We must be sure to consider
robustness when comparing topologies.

\begin{table}
\renewcommand{\arraystretch}{1.3}
\begin{center}
\begin{tabular}[b]{| l | l || c | c | c | c |}

\hline
 & Topology & \ 1 crash \   & \ 2 crash \ & \ 3 crash \  & \ 4 crash \ \\
\hline

                         & 16x16 free    & 36   & 12   & 04   & 01      \\
Worst possible           & 4x4 cascade \ & 75   & 50   & 25   & 00      \\
adversary distribution \ & 4x4 stratif.  & 75   & 50   & 25   & 00      \\
                         & 16x4 free     & 77   & 59   & 44   & 32      \\
\hline
\hline
                         & 16x16 free    & 36   & 12  & 04   & 01      \\
Best possible            & 4x4 cascade   & 75   & 75  & 75   & 75      \\
adversary distribution   & 4x4 stratif.  & 75   & 56  & 42   & 32      \\
                         & 16x4 free     & 77   & 59  & 44   & 32      \\
\hline
\hline
                         & 16x16 free    & 36   & 12  & 04   & 01      \\
Expected percentage:     & 4x4 cascade   & 75   & 55  & 39   & 27      \\
rand.\ adversary dist.   & 4x4 stratif.  & 75   & 55  & 39   & 27      \\
                         & 16x4 free     & 77   & 59  & 44   & 32      \\

\hline

\end{tabular}
\caption{Percent of messages delivered vs number of crashed nodes}
\label{table:expected-delivery}
\end{center}
\end{table}

Table~\ref{table:expected-delivery} shows that
4x4 cascades and 4x4 stratified networks do roughly the same on
average, but this is for very different reasons. The chance that the
configuration will block all messages increases much more quickly for
cascades, but the maximum possible delivery of messages remains much
higher. This can be seen in the table reflecting the most favorable
adversary distribution for up to four node crashes. To further
illustrate, if half of the nodes are bad in the 4x4 cascade topology,
then in about 1 in 6 cases a quarter of the messages get through, and
in exactly 6 cases of 12870, half of the messages get through the
cascades. For all other distributions, no messages get through.  If
half of the nodes are bad in the 4x4 stratified network, then the
highest percentage of messages that can pass through is 6.25.
However, some messages will be passed in the majority of adversary
distributions.

Of the scenarios we have considered, a 16x4 free route has the best
expected chance of message delivery for random adversary distribution.
It outperforms the others, unless the adversary has a particularly
innocuous distribution. Cascades do better under favorable distributions,
which are also much rarer for cascades than other topologies.
Also note that the expected fraction of passed messages is the same
for free routes regardless of which nodes fail: it is the most robust
with respect to adversary distribution as well as adversary size.

\subsection{Robustness of Anonymity}
\label{subsec:anonymity-robustness}

Robustness of anonymity against active attacks is harder to determine,
as such attacks can take on a variety of forms. In the simplest case
though, we can consider the effect on anonymity of simple node
crash, since this is the most straightforward way to actively shrink
anonymity. Also, as discussed in Section~\ref{subsec:blending}, there
are techniques to detect and/or deter more selective attacks.

The threat model we consider here is an extension of the
one in Section~\ref{sec:scenarios}. As before, the adversary can watch
senders and receivers. But now, besides failing to mix, hostile nodes may
also crash---failing to deliver any of their input messages.  A
combination of active attacks and observations (including some
internal observations) should prove the most devastating to anonymity.
However, we leave full examination of this for future work. Here we
concentrate on the effect of such intentional crash failures on
entropy for a mixnet periphery observer.

Anonymity of cascades is unaffected by this threat model.  Since each
cascade batch is independent of the others, any node that crashes will wipe
out all the messages in that anonymity set.  Anonymity robustness of
stratified and free-route topologies is more complex.

For a stratified network, if any entry node fails, the number of
messages drops by one quarter, causing a reduction in entropy of .42.
If two entry nodes fail, the entropy drops by 1. If 3 entry nodes
fail, entropy drops by 2. If all fail, the adversary learns nothing more
than if none fail.
%entropy remains the same as if none fail.
If a second layer node fails, assuming a balanced
layer-two distribution, anonymity of all messages is unaffected since
there is no change to the probability that an exiting message was any
incoming message. Note this is so even if the distribution of messages
across entry nodes is highly skewed. If the layer-two distribution is
skewed, then a node may fail with some effect on entropy. However, the
ability to affect anonymity in this way should be very small for
randomly chosen routes.  Ignoring the small effect of such
non-entry-layer failures, we see that the anonymity of a stratified
network given node crashes is usually better and at worst equal
to that of the cascade topology.

Free routes are even more complex. For entry layer nodes, the initial
effect of each crash is clearly smaller. However, since nodes are
used at multiple layers, a message that reaches a crashed node at a
given layer could not have been routed through that node at any earlier
layer. Further, the attacker may gain additional information by crashing
nodes only at certain layers! Even worse, as the ratio of input messages to
width of a layer shrinks, it becomes more likely that nodes at a given
layer will only receive messages from a subset of nodes at the previous
layer or, in any case, that the layer distribution will be unbalanced
between nodes to a significant degree.

On the other hand, because nodes are recycled for use at multiple layers,
it is much harder to plan an attack. If nodes can't crash and then come
back in a single batch (perhaps it's hard to do undetectably), crashing an
entry node to reduce the anonymity of a message at another node may cause
that message to be blocked when it must traverse the crashed node at a
later layer.  But it will generally be hard to predict when to come up
beyond a few layers, because the targeted message will likely be coming
from any of the remaining nodes after that much mixing.

To get some handle on this complex situation, we will consider a very
lucky adversary.
The adversary controls a quarter of the nodes in a 16x4 recycling
free-route.  Suppose a message enters the mixnet at a node not under
adversary control, and the adversary crashes all of its nodes.
Messages drop by a quarter. If the layer-2 distribution is such that
the layer-1 node that received the target does not send any messages
to the four adversary nodes, they remain crashed. Assuming that a
quarter of the remaining messages are addressed to them at layer-2,
remaining messages are now .56 of the original batch. Repeat for
layer-3 and layer-4. Remaining messages are down to .32 of the
original mixnet batch. In this case, despite all the luck of the
adversary, the anonymity is thus still better than that of a message
sent into a cascade processing a quarter of the original mixnet batch.
% And
%this is also better than the anonymity of the stratified 4x4 network
%with just two hostilely crashed entry nodes.
%% Paul -- how are we comparing this? Isn't two easier than four?
%% I don't see it. And given that, it's not clear it's helpful here?
%% Though I guess it would be useful to say that free-route is better
%% than stratified, if we can compare accurately.

We have not considered all possible active attacks. But for those we
have considered, the best choice for anonymity robustness appears to be
the free route, and worst is the cascade. We invite further research.

\subsection{Comparison with Asynchronous Batching Designs}

We have shown synchronous free-routes can provide good anonymity, but we
must also begin comparing this design to more traditional asynchronous
free-route designs like Mixminion. Synchronous batching needs no replay
cache (each message is labeled with its batch), weakens partitioning
attacks from blending and key rotation, and generally provides clearer
anonymity guarantees.

On the other hand, because Mixminion's pool batching strategy
spreads out message distributions between batches, our
design may fall more quickly to long-term statistical disclosure
attacks~\cite{statistical-disclosure}. Our design is also less robust
to transient failures: a late Mixminion message still arrives, whereas
in our system a node that is down throughout $t_\mathrm{hop}$ loses all
messages going through it. (Stratified and cascade networks have the
lowest chance of being down in a hop period they are needed, but
free-route networks lose proportionally fewer messages from a single
down node.)  But our design can tell the user for sure whether his mail
was delivered in the batch (and he can resend if not), whereas Mixminion's
unpredictability always leaves the user wondering if it will come
out sometime.

Like stop-and-go mixes~\cite{stop-and-go}, we may be able to get improved
anonymity by allowing Alice to choose to delay her message at a given
hop until the next batch. That is, the node would delay her message by
$t_\mathrm{batch}$ and re-introduce it at the same point in the path.
If each message is either delayed once or not delayed, that gives us
a latency of 3 to 6 hours for non-delayed messages, 6 to 9 hours for
delayed messages, and a 6-hour anonymity set (unless the attacker knows
that someone never sends or receives delayed messages, in which case the
anonymity set for those users is still 3 hours; also, if the attacker
owns the node Alice chooses, he may be able to speculate about which
senders would choose to delay messages).
We leave further comparison to future work.

\section{Summary}
\label{sec:conclusion}

Previously, only cascade networks were considered secure against very
powerful adversaries~\cite{disad-free-routes}. In this paper we show that
other topologies can use the synchronous batching strategy to achieve
similar protection. Further, we show that free-route topologies with
synchronous batching compare favorably to cascade networks. We invite
further analysis of the trade-offs between each topology.

\section*{Acknowledgments}
We acknowledge David Hopwood for the ideas and arguments behind Sections
2 and 3; and we thank LiWu Chang, Camilla Fox, Rachel Greenstadt,
Chris Laas, Ira Moskowitz, and Itamar Shtull-Trauring for probability
discussions.

%======================================================================

\bibliographystyle{plain}
\bibliography{sync-batching}

\appendix

%\newpage
\section{Entropy vs number of hops, for each topology}
\label{sec:entropy-per-hop}

\begin{figure}[ht]
\centering
\mbox{\epsfig{angle=270,figure=caschops,width=6cm}}
\caption{Entropy vs number of hops, for cascade network (16 nodes)}
\label{fig:caschops}
\end{figure}

\begin{figure}
\begin{minipage}[ht]{5.75cm}
\mbox{\epsfig{angle=270,figure=systhops,width=6cm}}
\caption{Entropy vs number of hops, for stratified network (16 nodes)}
\label{fig:systhops}
\end{minipage}
\hfill
\begin{minipage}[ht]{5.75cm}
\mbox{\epsfig{angle=270,figure=freehops,width=6cm}}
\caption{Entropy vs number of hops, for free-route network (16 nodes)}
\label{fig:freehops}
\end{minipage}
\hfill
\end{figure}

\clearpage

\input{model-app}

\end{document}


