%% Accountability in MIX-nets from Free Haven
%% Info Hiding Workshop 2000
%% Draft was due Dec 12
%% Camera-ready copy for pre-proceedings due Mar 22
%% Proceedings (finished version) due May 24

\documentclass{llncs}
\usepackage{epsfig}

%\newif\ifpdf
%\ifx\pdfoutput\undefined
%   \pdffalse
%\else
%   \pdfoutput=1
%   \pdftrue
%\fi

\begin{document}
%% Use dvipdfm instead. --DH
%\ifpdf
%  \pdfcompresslevel=9
%  \pdfpagewidth=\the\paperwidth
%  \pdfpageheight=\the\paperheight
%\fi

\title{A Reputation System to Increase MIX-net Reliability}
\author{Roger Dingledine\inst{1}, Michael J. Freedman\inst{2},
        David Hopwood\inst{3}, David Molnar\inst{4}}
\institute{Reputation Technologies, Inc.
\email{(arma@reputation.com)}
\and
Massachusetts Institute of Technology
\email{(mfreed@mit.edu)}
\and
Independent consultant
\email{(david.hopwood@zetnet.co.uk)}
\and
Harvard University
\email{(dmolnar@hcs.harvard.edu)}}
\maketitle
\pagestyle{empty} 
  
\begin{abstract}

We describe a design for a reputation system that increases the
reliability and thus efficiency of remailer services. Our reputation
system uses a MIX-net in which MIXes give receipts for intermediate
messages. Together with a set of witnesses, these receipts allow
senders to verify the correctness of each MIX and prove misbehavior to
the witnesses.

\end{abstract}

\section{Introduction}

Anonymous remailers are the most common method of anonymous e-mail
communication. Despite wide use of the current global remailer
network, this network is generally considered unreliable. 
Messages are
often dropped, and the newsgroup {\tt alt.privacy.anon-server}
contains many examples of a message being sent two or three times in
the hope that one instance will reach the destination.  
% XXX the above sentence can be greatly compressed
This unreliability directly affects the number of people using the remailer
network and their traffic patterns, which 
reduces the anonymity these networks provide.

One approach to increasing reliability is to write
more reliable software \cite{RProcess}. Another approach is to build 
MIX protocols that give provable robustness guarantees
\cite{desmedt,flash-mix,mitkuro}. Our approach is to 
build a reputation system to track MIX reliability, and to modify the
MIX protocol to support it. Because users choose
paths based on the published
scores for each MIX, this reputation system improves both
reliability (fewer messages get routed through dead MIXes) and
efficiency (the system dynamically rebalances the load based on
available reliable resources).

Currently deployed remailer reputation systems (better known as
\emph{remailer statistics}) collect data independently and are treated as
trusted third parties by client software. 
Reliable statistics servers are good targets for adversaries.
Furthermore, these statistics often only measure secondary properties of the
MIX-net servers, such as up-time.
%XXX does the above sentence need to be there?

We describe related MIX-nets and current statistics systems in
Section \ref{sec:related}.
Section \ref{sec:mix-design}
presents a MIX-net design that allows the sender
of a message, along with a collection of weakly trusted third party
\emph{witnesses}, 
to prove that a MIX failed to process a message.
Section \ref{sec:reputation} introduces a new set of agents called 
\emph{scorers}, who tally failure
proofs and serve them to client software. Because each failure
can be proven to any number of scorers, loss of a scorer
will be less disruptive than loss of a statistics server in current
reputation systems.

We stress that this work does not fully solve
the problem of MIX reliability. 
In presenting a simple reputation system, we hope
to stimulate further research on
reputation systems and modelling the increased reliability they provide.

\section{Related Work} 
\label{sec:related}

\subsection{MIX-nets}

Chaum introduced the concept of a MIX-net for anonymous 
communications \cite{chaum-mix}.
A MIX-net consists of a series of servers,
called MIXes (or MIX nodes), each of which is associated with a public
key. Each MIX receives encrypted messages, which are then decrypted, batched,
their order permuted, and forwarded on after stripping the sender's
name and identifying information. Chaum also proved security of MIXes against
a \emph{passive adversary} who can eavesdrop on all communications between MIXes
but is unable to observe the permutation inside each MIX. That is,
the \emph{view} of a passive adversary gives negliglible 
advantage over guessing in linking messages with their senders and receivers.

Current research directions on MIX-nets include
``stop-and-go'' MIX-nets \cite{kesdogan}, distributed ``flash MIXes''
\cite{flash-mix} and their weaknesses \cite{desmedt,mitkuro}, and
hybrid MIXes \cite{hybrid-mix}.
Previous work primarily investigates the \emph{robustness}
of MIX-nets in the context of a distributed MIX system
\cite{flash-mix}. A MIX is considered robust if it survives the
failure of any $k$ of $n$ participating servers, for some threshold
$k$. This robustness is all or nothing: either $k$ servers are
good and the MIX works, or they are not good and the MIX likely will
not work.

Robustness has been achieved primarily via zero-knowledge proofs of
correct computation.
Jakobsson showed how to use precomputation to reduce the overhead of such a
MIX network to about 160 modular multiplications
per message per server \cite{flash-mix}, but the protocol was later
found to be flawed \cite{mitkuro} by Mitsumo and Kurosawa. 
Desmedt and Kurosawa's alternate approach \cite{desmedt}
requires many participating servers. Abe's MIX \cite{abe}
provides \emph{universal verifiability} in which any observer can
determine after the fact whether a MIX cheated, but 
the protocol is still computationally expensive.

Our notion of reliability differs from robustness in that we do not
try to ensure that messages are delivered even when some nodes fail.
Instead, we focus on improving a sender's long-term odds of choosing a
MIX path that avoids failing nodes.
We note that the two approaches can be composed: a distributed MIX with
robustness guarantees can be considered as a single node with its own
reputation in a larger MIX-net.

\subsection{Deployed Remailer Systems}

The first widespread public implementations of MIXes were produced by
the cypherpunks mailing list. These ``Type I'' \emph{anonymous
remailers} were inspired both by the
problems surrounding the {\tt anon.penet.fi} service
\cite{helsingius}, and by theoretical work on MIXes.
Hughes wrote the first cypherpunks anonymous remailer \cite{remailer-history};
Finney followed closely with a collection of scripts which used Phil
Zimmermann's PGP to encrypt and decrypt remailed messages.
Later, Cottrell implemented the Mixmaster system \cite{mixmaster}, or
``Type II'' remailers, which added message padding, message pools, and
other MIX features lacking in the cypherpunk remailers.  
At about the same time, Gulcu and Tsudik introduced the Babel system
\cite{babel}, which also created a practical remailer design (although
one that never saw widespread use).

\subsection{Remailer Statistics}

Levien's \emph{statistics pages} \cite{levien} track
both remailer capabilities (such as what
kinds of encryption the remailer supports) and remailer
\emph{up-times}, observed by pinging the machines in question and by
sending test messages through each machine or group of machines.
Such \emph{reputation systems} improve
the reliability of MIX-nets by allowing users to avoid choosing
unreliable MIXes.  The Jack B Nymble 2 remailer client \cite{potato}
allows users to import statistics files and can then pick remailers
according to that data. Users can specify minimum reliability scores,
decide that a remailer should always or never be used, and specify
maximum latency.

\subsection{Three Approaches to MIX-Net Reliability} 

We can build protocols which give
specific guarantees of robustness and reliability, and prove
theorems about these protocols in suitably specified adversary models.
These results may be quite strong, e.g. ``this distributed MIX
delivers correctly if no more than half of its participating servers
are corrupt,'' 
yet the resulting
protocols may be complicated and inefficient.
%  Converging on the ``right'' protocol may be
%tricky; even a small misstep in a proof may invalidate all robustness
%results.
 
Instead of engineering the
MIX-net protocol directly to provide reliability, we can make use of
reputations to track MIX performance. In this approach, we specify a
set of behaviors which characterize a ``good'' or ``bad'' MIX.
Unlike the
reliability via protocol approach, we are not likely to prove strong theorems.
The goal of reputation is to make the system
``better'' without guaranteeing perfection.
 
A third option is to create economic
incentives for MIXes to stay reliable, or ensure that an adversary
who wishes to compromise reliability must spend a large amount of
resources. Currently, Zero-Knowledge Systems is exploring this
approach by paying MIXes for consistent reliability and
performance in the Freedom network \cite{freedom2}.
%% Cite Back? -DM XXX
% Preliminary reports suggest that this has not significantly helped the
% reliability of the Freedom network\cite{back}. 
Another variant on this approach
is to introduce the use of payments for each message sent through a
MIX.

Reliability via protocol is the most well-studied approach, while
reliability via reputations in the form of Levien statistics is the
most widely used. Our work combines the two approaches:
we modify the MIX-net protocol to support easy
tallying of MIX failures and then specify a suitable reputation system.

\section{A MIX-net With Witnessed Failures}
\label{sec:mix-design}

Verification of transactions in cryptographic protocols is commonly supported
either by performing the transaction publicly, or by using digitally signed
receipts.  We use the latter notion to develop a design in which MIXes
provide a receipt for each message they receive. 
%The receipt represents an agreement to process the message.

The sender Alice can query a MIX to see if it has proof that it attempted
to follow the protocol. But
the MIX might refuse to provide Alice a receipt for a particular message
either because it failed to send the message, or because it was
unable to obtain a receipt from the next hop.
We solve the problem of pinpointing failures as follows:
each message has a {\em deadline} by which it must be sent to the next
MIX. A MIX $N_i$ first tries to send a message 
directly to the next node, $N_{i+1}$. If $N_i$ has not received a valid
receipt by a specified period before the deadline, it enlists several
{\em witnesses}, who will each independently try to send the message
and obtain a receipt. Each witness that obtains a
valid receipt sends it back to $N_i$. Any witness that
doesn't will be convinced that $N_{i+1}$ is unreliable, and
provides $N_i$ with a signed {\em statement} to that effect.

\begin{figure}[htc]
\begin{center}
\leavevmode
\epsfbox{msg-flow2.1}
\caption{Message flow example}
\end{center}
\end{figure}

Thus for each message a MIX sends, it will either have
a receipt showing that it processed the message in the time allowed, or
statements signed by any witnesses it chooses, asserting that the next
MIX did not follow the protocol. 
We will describe a protocol which uses these receipts and statements to
allow the sender of a failed message to demonstrate that a particular
node on the path failed.

The above argument does not explicitly consider the possibility that
several consecutive MIXes are controlled by an adversary; we cover that
in Section \ref{subsec:proving}.
%XXX can we remove the above paragraph, since we cover it elsewhere?

While witnesses must be trusted to respond to requests, 
they need not be trusted to preserve anonymity,
since the messages sent to them are no more than the view of a passive
adversary with complete network access.
Therefore, the proofs of sender and receiver unlinkability
that apply to traditional MIX-net protocols still hold.

\subsection{Cryptographic Algorithms}

%% (semantically secure => non-malleable) under CCA2, so no need to say that.

We require a public-key encryption scheme, which
should be semantically secure under adaptive chosen ciphertext attack,
%(equivalent to IND-CCA2 in the definitions of \cite{pk-relations})\footnote{
%As is common in MIX-net protocols, the encryption scheme should
%pad plaintexts to a constant length on decryption. The IND-CCA2
%notion may need to be modified slightly to take this into account,
%but how to do that is beyond the scope of this paper.
and a public-key
signature scheme, which should be existentially unforgeable under adaptive
chosen message attack.

The encryption scheme is modelled as a key pair generation
algorithm $G_E$, a randomized encryption algorithm $E$, and a
deterministic decryption algorithm $D$. 
The encryption notation explicitly includes the random value:
$E_i^r(M)$ means the encryption of message $M$ and random value $r$
under the public key of $N_i$. We assume 
that if $N_i$'s key pair is valid (i.e. was generated by $G_E$),
$D_i(E_i^r(M)) = M$ for any plaintext $M$ and random value $r$.

The signature scheme is modelled as a key pair generation
algorithm $G_S$, a signing algorithm $Sign$, and a verification
algorithm $Ver$.
The notation $Sign_i(M)$ means the signature of $M$ under
the private key of $N_i$, and $Ver_i(Sig, M) = 1$ if $Sig$ is a valid
signature by $N_i$ on $M$, or $0$ otherwise.
%If $N_i$'s key pair is
%valid (i.e. was generated by $G_S$), then $Ver_i(Sign_i(M), M) = 1$
%for any message $M$.

We also assume that authentic, distinct encryption and verification
public keys for each MIX are known to all parties. Message recipients
are treated as MIXes.

\subsection{Overall MIX-net design}

Alice wants to send Bob a message anonymously. She chooses a path
through the network consisting of $k-1$ MIXes, $N_1 \dots N_{k-1}$.
Alice repeatedly
``onion'' encrypts her message, and sends the onion to the first
MIX in her path. That MIX returns a receipt, processes the onion, and
passes the unwrapped-by-one-layer onion to the next MIX,
which repeats these steps.
If the message does not reach Bob, the transaction has failed.
(Section \ref{subsec:end-to-end} shows how to use ``end-to-end receipts''
to ensure that Alice knows when this has occurred.)

Our system should be able to identify the MIX 
that caused the failure:

\begin{itemize}
\item{Goal 1, {\bf Identify failure}:} If $N_i$ fails to pass on a
well-formed message within the allowed time to the next node $N_{i+1}$,
then Alice can prove to any third party that $N_i$ was the failing MIX
(the completeness property).

\item{Goal 2, {\bf Reject false claims}:} No participant (including
Alice) can claim that $N_i$ failed to pass on a well-formed message to
$N_{i+1}$ in time, unless it really did fail (the soundness property).
\end{itemize}

\subsection{Timing Model}

We consider time to be split into periods, each of length one time unit,
corresponding to batches of messages sent by MIXes.
A message received by $N_i$ in the period ending at time $t$ will
have timestamp $t$
on the corresponding receipt signed by $N_i$. If $N_i$ is not the final
recipient, the message must be sent to
$N_{i+1}$ in the next period.
That is, the {\em deadline} for sending it to $N_{i+1}$ is $t+1$.

All parties (MIXes, witnesses, senders of messages, and verifiers of
failure claims) have clocks that are loosely synchronized (within
$T_{\epsilon}$ of the global time). Sending a message over
a network is assumed to take $T_{comm}$ in the worst case. Therefore,
a party needing to send a message by time $t$
should send it before $t - T_{\epsilon} - T_{comm}$ according
to its local clock, because its local clock may be slow, and because
it must allow time for the network communication. 
A party expecting a message by time $t$ should
allow it to be received as late as $t + T_{\epsilon}$ according to its
local clock, since its clock may be fast.

The protocol requires several system-wide time constants:

$T_{response}$ is the time that a MIX is allowed
between receiving a message and providing a receipt.

$T_{margin}$ is the length of time before the deadline at
which a MIX will attempt to send a message via the witnesses,
if it has not been able to obtain a receipt directly from the next node.

$T_{retain}$ is the time (number of periods) for which receipts
are retained by MIXes. That is, if a MIX is asked for a receipt in
the period ending at $t$, it need only provide it 
if the receipt has timestamp
$t - T_{retain}$ or later. This value determines the length of time
for which failure claims can be directly verified.

\begin{figure}[htc]
\begin{center}
\leavevmode
\epsfbox{timing.1}
\caption{Timing example}
\end{center}
\end{figure}

Figure 2 depicts an example time line (showing only global time):

\begin{itemize}
\item time A is the point at which $N_i$ stops trying to send to $N_{i+1}$
      directly.
\item time B is when a witness $w$ tries to contact $N_{i+1}$.
\item time C is the latest time at which $N_{i+1}$ can respond to $w$
      with a receipt (this could also be before the deadline t+1).
\end{itemize}

%Unfortunately the timing constraints, although important to the
%correctness of the protocol, tend to obscure the description of how
%it works. You may wish to ignore them on an initial reading.

\subsection{Transmitting a Message}

This section describes the protocol for transmitting a message from
Alice to Bob.

\subsubsection{Procedure {\tt Transmit}$(Alice, Bob, Plaintext)$:}

\begin{enumerate}
\item Alice chooses $k-1$ MIXes $N_1$, $\dots$ $N_{k-1}$ to form a ``MIX
path''. Let $N_0$ be Alice, and let $N_k$ be Bob.
\item Alice picks $k$ random seed values $r_1, r_2, \dots r_k$.
\item Alice creates an initial packet $M_1$, defined as

\begin{displaymath}
M_1 = E_1^{r_1}(N_2, E_2^{r_2}(N_3, \dots E_{k-1}^{r_{k-1}}(Bob,
E_k^{r_k}(Plaintext))\dots))
\end{displaymath}

\item Let $now$ be Alice's current local time, and
      let $Deadline_1 = \lceil now \rceil$.
\item Try to send $M_1$ and $Deadline_1$ directly to $N_1$, waiting
      for a receipt.
\item If a receipt $Rcpt$ is received, check that\newline
      $Ver_{dest}(Rcpt, $``Receipt: $M_1, Deadline_1$''$) = 1$; if so,
      stop.
\item If no receipt is returned, set $Deadline_1 := Deadline_1 + 1$, and
      use the procedure {\tt Hop-send}$(N_1, M_1, Deadline_1)$ below
      to resend $M_1$ to $N_1$.
\end{enumerate}

If all parties follow the protocol, the message will then be processed by
$N_1, \dots N_{k-1}$ as follows:

\begin{itemize}
\item $N_1$ reads $M_1$ from Alice, and processes it according to the
      procedure\newline
      {\tt Hop-receive}$(Alice, N_1, M_1, Deadline_1)$. 
     
\item $N_1$ decrypts $M_1$ to give $(N_2, M_2)$, where

\begin{displaymath}
M_2 = E_2^{r_2}(N_3, \dots E_{k-1}^{r_{k-1}}(Bob,
E_k^{r_k}(Plaintext))\dots)
\end{displaymath}

\item Let $Deadline_2 = Deadline_1 + 1$.
\item $N_1$ uses the procedure
      {\tt Hop-send}$(N_2, M_2, Deadline_2)$ to send $M_2$ to $N_2$.
\item This process is repeated by $N_2$, which sends $M_3$ to $N_3$, and
      so on for $N_3, N_4, \dots N_{k-1}$.
\item Eventually, $E_k^{r_k}(Plaintext)$ is sent to Bob. Bob can decrypt
      this message, so the plaintext has successfully
      been transmitted.
\end{itemize}

\subsubsection{Procedure {\tt Hop-send}$(N_{dest}, Message, Deadline)$:}

\begin{enumerate}
\item Try to send $Message$ and $Deadline$ to $N_{dest}$ directly, waiting
      for a receipt.
\item If a receipt $Rcpt$ is received, check that\newline
      $Ver_{dest}(Rcpt, $``Receipt: $Message, Deadline$''$) = 1$.

\item If a valid receipt is not received before $Deadline - T_{margin} - T_{\epsilon}$
      (by the sending node's local clock),
      \begin{enumerate}
      \item Let $W$ be a set of witnesses.
      \item Send ``Witness: $N_{dest}, Message, Deadline$'' to each $w
            \in W$ (causing {\tt Witness} to be called on each
            $w$). Wait for any $w$ to send back a receipt.
      \item If a receipt $Rcpt$ is received, check that\newline
            $Ver_{dest}(Rcpt, $``Receipt: $Message, Deadline$''$) = 1$.
      \item If no valid receipt is received, store any statements returned
            by the witnesses.
      \end{enumerate}
\end{enumerate}

Note that an imposter witness or MIX may send fake receipts to the sender in order to
try to confuse it.  The sender should ignore any receipts that are not
valid. If the sender receives more than one valid receipt, it need only
store one (chosen arbitrarily).

\subsubsection{Procedure {\tt Hop-receive}$(N_{src}, N_{dest}, Message, Deadline)$:}

\begin{enumerate}
\item Let $now$ be the current time.
\item If $now > Deadline + T_{\epsilon}$ or $now < Deadline - 1 - T_{\epsilon}$,
      drop the message and respond with an error.
%XXX similarly, can condense 'now'?
\item Otherwise, decrypt $Message$, and queue for transmission in the next period
      by {\tt Hop-send}.
\item Send back the receipt $Sign_{dest}($``Receipt: $Message, Deadline$''$)$
      to $N_{src}$.
\end{enumerate}

\subsubsection{Procedure {\tt Witness}$(N_{src}, N_{dest}, Message, Deadline)$:} 

\begin{enumerate}
\item Let $now$ be the current time.
\item (The witness must be sure that $N_{dest}$ has time to respond:)
      If $now > Deadline - T_{comm} - T_{\epsilon}$ or $now < Deadline - 1 + T_{\epsilon}$,
      drop the message and respond with an error.
%XXX similarly, can condense 'now'?
\item Try to send $Message$ to $N_{dest}$ directly, waiting for a
      receipt. 
\item If a receipt $Rcpt$ is received, check that\newline
      $Ver_{dest}(Rcpt, $``Receipt: $Message, Deadline$''$) = 1$.
      If so, send it back to $N_{src}$.
\item If a valid receipt is not received before $Deadline + T_{response} + T_{\epsilon}$,
      $N_{dest}$ has failed; send back a statement
      ``Failed: $N_{dest}, Message, Deadline$'', signed by the witness, to
      $N_{src}$.
\end{enumerate}

\subsection{Identifying and Proving Failed Transactions}
\label{subsec:proving}

%This section describes the method by which Alice proves the failure of a
%specific MIX to deliver a message, as well as the protection our
%protocol offers against a malicious Alice who either sends ill-formed
%messages or provides false claims.

Alice can prove
to any chosen verifier (call him Victor)
a claim that a MIX failed to deliver a message.
Section \ref{sec:reputation} describes Victor's duty in our reputation system.

Suppose that $N_i$ ($1 \leq i < k$) is the first node on the path
that does not follow the protocol, i.e., it fails to handle a message $M_i$
that $N_{i-1}$ sends to it, within the allowed time. Alice wants to prove
to Victor that $N_i$ failed to process this message (which she might
discover by a binary search over her MIX path).

Because $N_{i-1}$ behaved according to protocol, it will have a receipt
$Rcpt_i$ for the message $M_i$ with deadline $Deadline_i$.
As Alice knows all the random padding values, she can compute
the message that $N_i$ is supposed to send to $N_{i+1}$:

\begin{displaymath}
(N_{i+1}, M_{i+1}) = D_i(M_i) = (N_{i+1}, E_{i+1}^{r_{i+1}}(\dots E_{k-1}^{r_{k-1}}(Bob, E_k^{r_k}(Plaintext))\dots))
\end{displaymath}

She suspects that $M_{i+1}$ was not sent to $N_{i+1}$, so she
prepares a claim by obtaining the $Rcpt_i$ from $N_{i-1}$, and calculating
$M_{i+1}$ as above. Then Alice sends
``I blame $N_i$, claim: $r_i, N_{i+1}, M_{i+1}, Deadline_i, Rcpt_i$'',
which Victor verifies:

\subsubsection{Procedure {\tt Verify-claim}$(N_i, r_i, N_{i+1}, M_{i+1}, Deadline_i, Rcpt_i)$:} 

\begin{enumerate}
\item Check that $N_i$ and $N_{i+1}$ refer to valid MIXes, and $M_{i+1}$
      is the correct length for an intermediate message.
\item Calculate $M_i = E_i^{r_i}(N_{i+1}, M_{i+1})$, and check that\newline
      $Ver_i(Rcpt_i, $``Receipt: $M_i, Deadline_i$''$) = 1$.
\item Let $now$ be the current time. If $Deadline_i + 1 < now - T_{retain} + T_{\epsilon}$, then
      it is too late for the claim to be verified, since $N_i$ may legitimately
      have discarded its receipt; the claim is therefore rejected.
\item Send ``Receipt request: $N_{i+1}, M_{i+1}$'' to $N_i$.
\item If ``Receipt response: $Rcpt_{i+1}, Timestamp$'' is received
      from $N_i$ within time $T_{response}$, such that
      $Ver_{i+1}(Rcpt_{i+1}, $``Receipt: $M_{i+1}, Timestamp$''$) = 1$\newline
      and $Timestamp \leq Deadline_i + 1$,
      then reject the claim.
\item If statements ``Failed: $N_{i+1}, M_{i+1}, Timestamp$'' signed by
      a sufficient set of witnesses are received, for some
      $Timestamp \leq Deadline_i + 1$, conclude that $N_i$ made a
      reasonable attempt to send the message, and reject the claim.
      (See Section \ref{subsec:witness-trust} for discussion of
      trust requirements on witnesses.)
\item Otherwise, conclude that $N_i$ failed -- either because
      it did not process the original message, or because it did not
      respond to the receipt request.
\end{enumerate}

\subsubsection{Proving that a delivery failure results in a proper claim:}

$N_{i+1}$ will only give out a receipt for which
$Ver_{i+1}(Rcpt_{i+1}, $``Receipt: $M_{i+1}, Deadline_i + 1$''$) = 1$
if it received the message $M_{i+1}$ by time $Deadline_i + 1$. If it did
not receive $M_{i+1}$ by then, assuming $N_{i+1}$'s
signatures cannot be forged, $N_i$ will not be able to provide such a
signature. Thus, Victor will conclude that it failed.

Node $N_{i+1}$ might be controlled by the adversary; in this
case, it might provide a receipt on $M_{i+1}$ in order to exonerate
$N_i$, even though $M_i$ was not actually processed. However, Alice
can then use this receipt in the same protocol to attempt to prove
that $N_{i+1}$ failed.  Therefore, the adversary will be able to
choose which of the contiguous MIXes it controls can be proven
unreliable.  However, since there are only $k-2$ other nodes on the
path, Alice will be able to prove that some node failed, after at most
$k-1$ iterations of the protocol.

Because Alice can always prove that some MIX failed if it did
in fact fail, we satisfy Goal 1, being able to identify failures in the
MIX-net.

We note that knowledge of the fact that $N_i$ and $N_{i+1}$ are part
of Alice's chosen path is \emph{not} part of the view of a passive
adversary in a normal MIX-net protocol. The closer $N_{i+1}$ is to the
end of the path, the more likely it is that this additional
information will allow an adversary to link Alice with Bob.
To avoid giving away this information,
Alice may send all the messages needed
for the failure proof over the MIX-net. When requesting receipts she
can include a reply block, so that the node will be able to send back
the receipt while Alice remains anonymous.

It may seem circular to rely on the MIX-net to send messages needed
for a failure proof, when the goal of the proof protocol is to improve MIX-net
reliability.
However, since these messages are only used to prove failure
and do not convey any other information, it is not critical that all of
them are successfully transmitted. Alice can repeat any attempts to obtain
receipts and to send the claim message to Victor as often as necessary, using
random independent paths for each attempt. This will succeed quickly provided
the probability of a message making it through the MIX-net (from Alice)
is not too small.
%% Make this more precise. Can we handwave that if the system is not 'too'
%% unreliable initially, its reliability stabilises at a high level?
%% We do that later, so no need to do it here.
%%%XXX actually, we don't do that anymore? should we briefly here? -RD

%% Does the extra traffic over a short period pose a significant risk to
%% Alice's anonymity? No, I don't think it does, unless the mix-net is
%% broken for other reasons. --DH

\subsubsection{Proving that false claims are rejected:}

We wish to show that no participant can claim that
$N_i$ failed to pass on a well-formed message sent by Alice
to $N_{i+1}$, unless it really did fail to send such a message.
Without loss of generality, we will consider the adversary to
be Alice.

Recall that Alice's claim to Victor is of the form
``I blame $N_i$, claim: $r_i, N_{i+1},$ $M_{i+1}, Deadline_i, Rcpt_i$''.
Victor then calculates $M_i = E_i^{r_i}(N_{i+1}, M_{i+1})$.
Decrypting both sides, we obtain $D_i(M_i) = (N_{i+1}, M_{i+1})$,
assuming $N_i$'s key pair is valid.\footnote{
We can assume that $N_i$'s public key is valid because it is chosen
by $N_i$, who could only hurt its own reputation via an invalid key.}

% We
%omit consideration of an $N_i$ controlled by Alice that purposely
%chooses an invalid key, because then Alice only damages the reputation
%of a MIX she controls.}

\begin{itemize}
\item Suppose Alice caused the message $M_i$ to be sent to $N_i$ by
time $Deadline_i$, and then tried to claim that $N_i$ failed. 
A well-behaving $N_i$ will decrypt $M_i$ to give the next hop and
intermediate message $(N_{i+1}, M_{i+1})$, and try to
send $M_{i+1}$ to $N_{i+1}$ using {\tt Hop-send}. Either $N_i$
will obtain a receipt for this message (signed by $N_{i+1}$ and
having the timestamp $Deadline_i + 1$), that refutes Alice's claim,
or it will have signed statements from a sufficient set of
witnesses (see Section \ref{subsec:witness-trust}) saying that
$N_{i+1}$ refused to provide a receipt when it was obliged to do
so. In either case, $N_i$ will be exonerated.

\item Suppose Alice did not cause $M_i$ to be sent to $N_i$ by
$Deadline_i$.\newline
In order to make a credible claim, Alice needs a
receipt $Rcpt_i$ such that $Ver_i(Rcpt_i, $``Receipt: $M_i, Deadline_i$''$) = 1$
(since Victor will check this).
However, if $N_i$ did not receive $M_i$, it will not have given out
any such receipt, and so assuming that $N_i$'s signatures cannot
be forged,\footnote{
We take the position that if $N_i$'s private key has been compromised,
it should be considered to have failed.}
it will be exonerated.
\end{itemize}

A node could also be falsely accused of failure if
there are ``sufficient'' witness statements against it as defined in Section
\ref{subsec:witness-trust}. We assume that this does not occur, because
the adversary is not able to subvert witnesses in the core group.

Therefore, we satisfy Goal 2. Note that the above argument covers the
case in which Alice attempts to send messages to $N_i$ that either do
not decrypt, or decrypt to ill-formed plaintexts, since we have proven
that for Alice's claim to be accepted, $N_i$ must have received a message
with a well-formed $(N_{i+1}, M_{i+1})$ pair as the plaintext.

Many MIX-net protocols require MIXes to refuse to pass on a message if it
is a {\em replay} --- a message the MIX had already received
within a given time period (see \cite{babel} for a rationale). We prevent
MIXes from losing reputation because of this behavior as follows.
When a MIX receives
a replayed message (using {\tt Hop-receive} as normal), it will already
have a receipt from the previous time it sent that message (or else
witness statements showing that it tried to send it). Steps 5 and 6 of
{\tt Verify-claim} show that it can
provide the earlier receipt or statements when challenged with
a failure claim. We
define the length of time for which replays are remembered to be the same
as $T_{retain}$.
A MIX that receives a replayed message should delay the expiration of
the earlier receipt or statements for a further $T_{retain}$ periods.

%Note that a message should be considered a replay if its plaintext has
%been seen before, not just its ciphertext. This is because there may
%be many ciphertexts that decrypt to a single plaintext. If these were
%not considered replays, Alice would be able to prepare two ciphertexts
%decrypting to the same plaintext, and send both to $N_i$.
%% On second thoughts I don't think this causes a problem. All that
%% happens is that $N_{i+1}$ receives a replay, but it should still
%% send back a receipt, so $N_i$ is OK (and $N_{i+1}$ is OK because it
%% has its earlier receipt from $N_{i+2}$). --DH

\subsection{End-to-End Receipts}
\label{subsec:end-to-end}

If Alice is posting to a public forum like Usenet, she (and any verifier) can 
see whether the message was posted correctly. Otherwise,
if the MIX-net supports reply blocks or another
method for two-way communication, 
Bob's software can immediately send an {\em end-to-end receipt} back
through the MIX-net to Alice.
%Note that reply blocks used for end-to-end receipts only need to be used
%once, and the receipt is sent immediately. This allows some of the security
%problems otherwise associated with reply blocks, pointed out in
%\cite{babel}, to be eliminated.
%
Since a failure can also occur on the return path, 
Alice should
be able to prove one of three cases: a MIX on the forward
path failed; a MIX on the reply path failed; or Bob failed to send a
receipt.

With careful design of message formats, it is possible to make
receipt messages indistinguishable from forward messages to nodes
on the reply path. In that case, the same protocol used to claim failures
in sending forward messages will be applicable to reply messages.

\subsection{Trust Requirements for Witnesses}
\label{subsec:witness-trust}

Dishonest witnesses do not compromise anonymity, but they
can compromise reliability.
Witnesses could either refuse to give a statement or receipt for a MIX
that has failed, or (especially if users trust an arbitrary set of
witnesses chosen by a sending node)
make false statements in order to frame a MIX
that has not in fact failed.
We suggest defining a core group of witnesses who are relatively widely
trusted.  If some threshold number of this core group provide
statements implying that a node $N_{i+1}$ has failed,
that would be considered {\em sufficient} for the purposes of
{\tt Verify-claim}.
Specifying a fixed (or even slowly changing) group of witnesses is
not ideal, but if the messages sent to this group are also published,
other parties can duplicate their actions and
gain confidence in their behavior.

\section{Reputation Systems}
\label{sec:reputation}

Reputations have been suggested as a means of improving MIX-net
reliability \cite{timmay,levien}.
In layering on a reputation system, we add two more
agents to the system. \emph{Raters} 
make observations
about the performance or honesty of MIXes. In our case, the raters are
both the sender Alice and any MIXes that make use of the witnesses.
\emph{Scorers} 
tally observations from raters and make
these tallies (or \emph{scores}) available. 
For simplicity, we choose to give the scorer the duties
of verifier and witness as well.

As in \cite{potato}, sender software must be configurable to
automatically use scores.
Any user of the MIX-net must be able to contribute \emph{ratings}.
The scoring system must be verifiable: scorers can
determine the credibility of ratings, and other users can
verify that scorers are tallying ratings correctly. 
Clients must be able to draw conclusions from scores
that lead to ``good'' predictions. The overall scoring algorithm must be
dynamic, recognizing and reflecting recent trends in MIX performance. 
While achieving all of these goals, the system must also maintain the
level of anonymity provided by the MIX-net.

\subsection{Reputation System Overview}

We introduce a set of scorers, each named Sally. Each Sally
keeps her own database of performance scores for MIXes.
She receives, verifies, and tallies failure claims.
Sally also sends test messages to distinguish reliable
MIXes (few delivery failures due to good performance)
from new MIXes (few delivery failures because nobody has tried
them yet).
%
%We would like to develop a flexible scoring system, modelling the
%actual scoring process as arbitrary reputation functions.  The inputs
%to Sally are success or failure claims; she outputs some probabilistic
%reputation distribution for each node (the variance of this
%distribution reflects her confidence).

If we simply count the number of messages that each MIX drops,
an effective attack would be to constantly add new unreliable
MIXes. Therefore scores include both a count of negative ratings, and
also a ``minimum number of positive ratings'' requirement, which is a
threshold configurable on the client side.
Client software downloads Sally's reputation database, 
and allows the user to
specify parameters when selecting acceptable MIX paths, such as
``expected success of transmission''.

%As we showed in Section \ref{subsec:proving}, each negative rating
%really does represent a failed delivery.
%That is, if a valid message
%onion is sent to $N_i$, then it really is $N_i$'s fault if he doesn't
%either deliver the decrypted onion to $N_{i+1}$ within the allowed time
%window, or get statements from credible witnesses saying that $N_{i+1}$
%was down.  Anybody who is able to demonstrate that it was a valid onion
%is either that MIX or the onion's author; the MIX has no incentive to
%publicize that it failed at a transaction.
If our assumptions in Section \ref{subsec:proving} hold,
there is no way to spoof negative
ratings.  Note that an adversary may be able to force negative ratings
on a MIX, while goal 2 still holds: if he floods that MIX's incoming
bandwidth (either directly or via witnesses), the MIX will no longer
be able to sustain the load. 
However, this is exactly the point
where the MIX demonstrates that it is unreliable.
Causing MIXes to lose reputation in the face of
\emph{successful} flooding is consistent with our scoring system
goals: the scoring system measures reliability and
capabilities, not intent.

\subsection{Increasing Confidence in Positive Ratings}
\label{sec:rep-solutions}

An adversary can easily fake a positive rating by building a message
which uses a MIX path entirely owned by him, and then generating a transcript
which ``proves'' successful transmission.
%
%Algorithms that consider the number of positive ratings (either
%by adding positive and negative ratings,
% as eBay does,
%or by taking the
%ratio of positive ratings to negative ratings) are generally going to be
%vulnerable to this sort of shilling attack.
%
%However, using only the number of negative ratings to calculate reputation
%score gives a perfect score to completely untested MIXes.
We need to make
positive ratings actually reflect a MIX's ability to successfully
deliver Alice's message in the future.

One approach to making positive ratings more reliable (and thus more
meaningful) is to build a graph based on rater credibility
%that addresses reputation transitivity,
such as that employed by Advogato
\cite{advogato}.  
Similar to the PGP web of trust, this
%technique builds a directed graph where edge weights represent trust
%that one vertex places in the other; in our case, each vertex would be
%a MIX. Calculating maximum flow from some node $A$ in the graph to
%some node $B$ shows how much trust $A$ places in $B$, even if $A$
%never directly rated $B$.
metric aims to reduce the damage that an
adversary can cause by \emph{pseudospoofing} -- creating a multitude
of identities each controlled by that adversary.  
%At first, each of
%the new nodes is connected only to itself in the graph.  Even after the
%pseudospoofing nodes begin to form connections with the rest of the graph,
%there will still be ``trust bottlenecks'' that limit the amount of trust
%arriving at any of the pseudospoofing nodes.
%
Another approach is to treat reputation as a probability: an estimate
of the expected outcome of a transaction with that MIX. Scores might
simply be the sum of ratings, normalized and weighted by
the credibility of raters. 
Other designs employ neural networks or
data clustering techniques to apply non-linear fitting and optimization
systems to the field of reputation.
% However, such complex solutions may
%not be easily understood or verified by the users of the reputation score.
%Moreover, to our knowledge these more statistical approaches to aggregating
%reputation values have never been studied in the context of adversaries
%who understand and attempt to exploit the algorithm.  Such evaluation
%techniques which are intended for ``clean'' environments can introduce
%serious vulnerabilities and complications, which in turn require rater
%credibility tracking systems to track which raters are providing
%``good'' data.
%Rather than developing a complex and brittle rater credibility system,

Our solution emphasizes simplicity and ease of implementation and
evaluation.
We solve the positive rating credibility problem by having each Sally
produce positive ratings herself --- after all, if Sally sends the test
messages herself, she knows they are unbiased ratings. MIXes that
process her message will earn positive ratings that Sally knows to be accurate.

%In order to ensure that positive ratings are accurate, test messages
%should be indistinguishable from real messages (otherwise, an adversary
%could behave reliably for test messages, but unreliably for real messages).
%In addition, nodes should not be able to determine that the sender of a message
%is a rater; otherwise the MIX could decide to behave well only for a small
%set of senders and badly for everyone else. This is similar to the problem
%restaurant critics face when attempting to determine a restaurant's ``real''
%level of service. 

%% I think we definitely need to make this point. --DH
It may be possible for an adversary to guess whether a message
has been received directly from a sender (i.e. the adversary is the first
hop on the path, $N_1$), or if it is being sent to the final recipient
(i.e. the adversary is $N_{k-1}$). Unfortunately, it is difficult to
produce simulated messages that are completely indistinguishable from
real messages in these cases. We do not have a fully satisfactory solution
to this for positive ratings; instead, we rely on negative ratings to
expose an adversary that behaves unreliably only when it is the first or
last hop.

%%Agreed, this is sub-optimal. Leave it out. -rd
%MIXes may provide preferential service to witnesses, just as
%restaurants may provide preferential service to restaurant critics.
%Witnesses may choose to send
%their messages via one hop of the MIX-net as well, to hide their identity;
%this complicates the protocol and timing, though.
%% I don't agree. --DH

Sally should expire her memories of transactions after a certain
amount of time so that old failures do not haunt a MIX
forever.  Similarly, MIXes need to have a threshold of recent
successes in order to stay ``in the running''.
Alice configures her client software to choose only MIXes that
have some minimum number of positive ratings. Out of this pool,
she weights the MIXes she uses for her path based on the number
of verified delivery failures observed for this MIX.

This system reacts quickly to a
decrease in reliability of a MIX. A MIX with high reputation will
have many users routing messages through it.
if it suddenly stops delivering messages, these users will
quickly deliver a series of negative ratings.
This negative feedback process serves to
stabilize the system so scores reflect reality. 

For redundancy and to allow verifiability of scorers,
Alice can remember which mails had a corresponding end-to-end receipt,
tally her transactions, and
build her own score table in which she is confident of both
the positive ratings and the negative ratings.  Periodically
comparing her score tables with those of the available Sally's allows
Alice to ``test the waters'' herself and weakens
the trust requirements on scorers.
If claims for dropped messages are published, anybody can
verify them. Thus Alice might keep track of negative ratings for a few
weeks, then compare with Sally to determine if Sally's scores are
actually reflecting all of the negative ratings.

\subsection{Implications for Traffic Analysis}
\label{sec:traffic}

The addition of witnesses to the protocol may introduce new attacks.
Direct communications between nodes can use link encryption, but encrypting
the messages to witnesses would have little benefit (and would be
incompatible with publishing these messages, as suggested in Section
\ref{subsec:witness-trust}). So if an adversary can force the
witnesses to be used instead of direct communication, this may weaken
the security of the network. 
%On the other hand, link encryption of messages
%between nodes is not strictly necessary to achieve anonymity in a
%MIX-net; it is only a ``belt-and-braces'' measure.
%% A lot of context is needed to describe marking attacks properly -
%% I'll leave it to another paper, maybe. --DH

The reputation system also introduces new attacks.
Eve could gain a high reputation and thus get more traffic routed through her
MIX, in order to make traffic analysis easier.  In a system
without reputations, the way to purchase more traffic to analyze is
not so clear; now it is simply a matter of maintaining a reliable MIX.
In addition, the adversary now has incentive to degrade or sabotage the
performance of other nodes to make his relative reputation
higher. This kind of attack was described by RProcess as ``selective
denial of service'': the bad guys want traffic to go through their nodes,
so they ensure that all other nodes are less reliable \cite{RProcess}.
As recent distributed denial of service attacks demonstrate, crippling
an Internet host can be easy. Scorers must expire
ratings promptly enough to prevent an adversary from easily tarnishing
the reputations of all other MIXes; this system tuning will be extremely complex and difficult.

On the other hand, we may be
making the system \emph{more} anonymous by making it more
reliable. Currently, users may have to send a message many different
times, through many different paths, before it gets through.
These re-sends of the same message offer more
information to traffic analyzers.
A better theoretical framework for traffic analysis needs to be
established before we can make any quantifiable statements about the
implications of the proposed protocol.

\section{Conclusion and Future Directions}
\label{sec:conclusion}

We have described a reputation system for remailer
networks, based on a MIX-net design that employs receipts for
intermediate messages. 
There are a number of directions for future research:

\begin{itemize}

\item Create a ``reliability metric'' and an accompanying model which will allow
us to quantify the behavior of our reputation system. For instance, we might
consider as a metric the expected probability of a message making it from
sender to receiver. Then we would
calculate reliability with and without the reputation system in place and
see whether reliability improves.
A parallel model characterizing efficiency of a MIX network might be very
enlightening, especially from a network flow optimization viewpoint.

\item Can we achieve some further measure of reliability (or resource
management) through the use of electronic cash or similar
accountability measures?
% self-cite O'Reilly P2P chapter?
%Indeed, the NymIP group \cite{nymip} has a long list of similar
%improvements that need to be further analyzed and developed.

\item Can we defend against the selective DoS attack
described in Section \ref{sec:traffic}? How can we protect against
adversaries who want their MIXes listed as most reliable?

\item Can we make a reputation system that is both efficient and
universally verifiable? Currently, only Alice can prove that
a message did not reach its destination. Can we apply zero-knowledge
proofs so that Alice does not leak any information about the next hop,
while remaining practical?
Can we extend this so that anyone can detect a failed MIX?

\end{itemize}

This paper provides a foundation for further analysis of MIX-net
reliability and reputations. Our 
reputation system is
designed to be simple and easily extensible. Much work remains in a wide
variety of directions before a reliable, secure, and ubiquitous remailer
network can be put in place.

\section*{Acknowledgements}

We thank Nick Mathewson and Blake Meike for help with the reputation
system; Nick Feamster, Kevin Fu, Chris Laas, Anna Lysyanskaya, and Marc
Waldman for discussions; and our anonymous reviewers for many useful
comments.

\bibliographystyle{plain} \bibliography{mix-acc}

\end{document}


