\documentclass{article}
\usepackage{fullpage}
\usepackage{fancyhdr}
\usepackage{multicol}
\title{Trust Economies in the Free Haven Project\\A Thesis Proposal}
\author{Brian T. Sniffen}
\newcommand{\thedate}{March 1, 2000}
\date{\thedate}
\pagestyle{fancy}
\pagenumbering{arabic}
\lhead{Brian T. Sniffen (\emph{brians@mit.edu})\\6.199 AUP Proposal}
\rhead{Free Haven Project\\\thedate}
\begin{document}
\maketitle
\begin{multicols}{2}
\section{Introduction}
The intent of the Free Haven Project is to create a system for
anonymous publication and retrieval of information in such a way that
information, once injected into the system, is very difficult to
remove.  My role in the project involves the creation of the
heuristics for the trust economy system and the implementation of that
system.

\section{The Free Haven Project}
The Free Haven is a project to develop and implement a distributed
method of secure anonymous fault-tolerant data storage.  Based on the
following assumptions\footnote{Roger Dingledine.  Free Haven Abstract.
http://www.seul.org/archives/freehaven/dev/Feb-2000/msg00067.html}, we
are developing an economy of trust and implementing the first public
secure data haven.

\begin{quotation}
\small The Internet is growing at an unprecedented rate. However,
while technical advances are providing greatly increased bandwidth and
well-connected storage capacity, support for privacy and anonymity on
the Internet is largely unchanged. Commercial enterprises as well as
free software projects are hoping to help solve this problem: examples
include Zero Knowledge Systems, a company building its own
closed-source private network of low-latency mixnets, and the Freenet
project, a group of Internet programmers designing a network that will
duplicate frequently retrieved information and thus make it difficult
to delete information. Most current works suffer either from
closed or unfinished source. More importantly, though, their designs
sacrifice anonymity for accessibility.  The Free Haven Project aims to
design and deploy a system which uses a secure mixnet for
communication, and which emphasizes distributed, reliable, and
anonymous storage over efficient retrieval. Some of the problems we
address include providing sufficient accountability without
sacrificing anonymity, building trust between servers based entirely
on their observed behavior, and providing user interfaces that will
make the system usable for end-users.
\end{quotation}

\subsection{Implementation}
The Free Haven system is based around two networks of servers: the
servnet and the mixnet.  The \emph{servnet} is composed of
high-storage-capacity machines which store data and arrange to trade
shares of data among themselves.  The \emph{mixnet} is composed of
high-bandwidth and high-connectivity machines which anonymously pass
messages between clients and the servnet nodes.

The Free Haven relies to a great extent on the existence of a secure
pseudonymous mixnet.  Given that such a system has not been deployed,
the Free Haven project is developing its own mixnet, which allows both
anonymous and pseudonymous point-to-point transmissions.  The mixnet
is merely a medium of communication, however.  The heart of Free Haven
lies in the servnet.

Each node in the servnet hosts a certain amount of data from other
nodes in exchange for the ability to store its data elsewhere.  The
system is designed to conform to the following specifications:
\begin{enumerate}
\item Users may anonymously insert files into the servnet.
\item Users may anonymously retrieve files from the servnet.
\item Servers may be added to and dropped from the servnet.
\item Files must be recoverable given server failure.
\item The current location of files should not be known.
\item The system must be decentralized to maintain efficiency, security,
and reliability.
\item Malicious servers should not be trusted with data.
\end{enumerate}

The mixnet deals with points (1) and (2) above.  Point (3) is
implemented by use of expiration dates for files; a server which
wishes to leave the servnet can simply wait for its files to expire,
or trade\footnote{Trading is explained below.} for files which expire
more quickly.  To implement point (4), we are using an information
dispersal algorithm to break each file in the Haven into
\emph{shares}, signing each piece with a private key which we then
dispose of.  Only a fraction of the shares are necessary to
reconstruct the original file.  This provides resilience against
malicious or faulty nodes in the servnet.  Individual shares are
traded around the servnet, making it difficult to reliably locate even
a piece of a file.  Pieces are recalled by broadcasting over the
mixnet a request for all shares whose signatures verify under a
certain public key.
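As an illustration of the dispersal parameters, suppose a file of size
$F$ is split into $n$ shares such that any $k$ of them suffice to
reconstruct it.  The symbols $F$, $n$, and $k$ and the numbers below
are assumptions chosen for this example, not values fixed by the
project.  In a Rabin-style information dispersal algorithm each share
has size about $F/k$, so
\[
\mbox{total storage} \approx \frac{n}{k}\,F, \qquad
\mbox{tolerated losses} = n - k.
\]
For instance, $n = 20$ and $k = 10$ would double the storage cost and
allow any ten shares to be lost without losing the file.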

Points (3) through (7) are dealt with by means of a \emph{trust
economy system}.  This economy allows us to find an acceptable
compromise between accountability and anonymity.  In general, a server
is trusted with no more work than it has already performed.  This
ensures that even a maximally malicious server does at least as much
useful work as harm, for a useful-work ratio of at least 50\%.
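To see where the 50\% figure comes from, write $W > 0$ for the useful
work a server has performed so far and $D$ for the damage it can do by
betraying what it has been entrusted with.  Both symbols are
introduced here for illustration and are measured in the same units.
The rule above gives $D \le W$, so
\[
\frac{W}{W + D} \ge \frac{W}{W + W} = \frac{1}{2}.
\]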

\subsection{Expected Threats}

In pursuing the above goals, the system will come under a variety of
attacks, and we have tried to anticipate most of them.  There are
three primary modes of attack: on the communications medium, on the
servnet, and on individual files.  We have considered all three and
planned countermeasures for each.  In particular, we expect the
following sorts of attacks:

\begin{enumerate}
\item{Attacks on the Communications Medium}
\begin{enumerate}
\item The communications medium is either vulnerable to traffic
analysis or involves a high degree of latency, and high latency in
turn requires a great deal of space for storing messages in transit.
\item Traffic analysis attacks are normally difficult.  An attacker
can make them easier by volunteering to become part of the
communications medium, which allows closer observation of the traffic
and a better idea of where it is coming from or going to.
\item Attack the time-synchronization protocol to reduce the latency
of the system and ease traffic analysis.
\item Denial of service attack on the communications medium: flooding
it with messages may well increase latency beyond tolerable limits, or
actually crash some of the nodes.
\end{enumerate}

\item{Attacks on the Servnet}
\begin{enumerate}
\item Go find a physical servnet node, and prosecute the owner based
on its contents.
\item Physically destroy a servnet node.  This attacks the
trust-system as well as the integrity of the data in the network.
\item Become part of the Servnet.  Earn trust, then betray it.  We need an analysis of how much trust
(in megabyte-days or comparable units) you need to betray in order to do real damage.
\item Attack the time-synchronization protocol to make files expire
earlier than expected. 
\item Make it appear that a servnet node has violated the protocols in
such a way as to reduce trust in it.
\item Appear to violate the protocols.  When someone points this out,
prove the accuser wrong and accuse them of the above attack.
\item Claim that the servnet or mixnet concept is patented or
otherwise illegal.  Sue the Free Haven Project and any known node
administrators.
\item Trade garbage into the servnet, gain trust, then delete what you
get before it expires. We need an analysis of how much damage can be
done per unit of useful work.
\item Attack the generosity of individuals: increase the personal cost
of running a servnet or mixnet node, either by adding a monetary cost
to moving large quantities of data around, or by adding a bad
reputation such as ``harboring terrorist data and kiddie porn''.  %We
%rely on the strength of our comrades revolutionary consciousness for
%protection against this attack.
\item Denial of service attack on the servnet: continued flooding of
queries for data or requests to join the servnet may use up all available
bandwidth and processing power for a node.
\end{enumerate}

\item{Attacks on a File}
\begin{enumerate}
\item Attack to find the publisher of a document, or anybody else who has
information about the document or its location.
\item Attack to find people interested in a particular document.
Claim to have one, and see who requests it.
\item Swap until you have enough control over a file you object to,
then destroy it.
\item Conspire to make a cause ``unpopular''.  Convince servnet node
administrators that they don't want to be hosting data for these
unpopular causes, and that they should manually prune their data.
\item Insert false shares of a file into the servnet.  
\end{enumerate}
\end{enumerate}

\subsection{Defenses against Expected Threats}

In order to counter the anticipated attacks, we have developed a
number of defenses.  The labels after each defense refer to the
attacks listed above that it addresses:

\begin{enumerate}
\item{Physical Security:} This is the responsibility of individual servnet node
owners.  In the event of failure here, we fall back on the robustness of the
data storage system. \emph{2a, 2b}
\item{The Trust System:} This serves two functions.  First, it ensures that
anybody entering the servnet provides useful work in proportion to the amount
of damage they can do.  Second, it provides a measure of accountability.  Those
who do good things are trusted to do good things again, and those who do bad things
are trusted to do bad things again. \emph{2c, 2e, 2f, 2h}
\item{The Mixnet:} This preserves the essential anonymity and untraceability of
the service.  Its latency also protects against traffic analysis. \emph{1a, 1b, 2b, 3a, 3b}
\item{Cryptographic Signatures of Fragments:}  Strong cryptography protects us
against spoofing of shares, as well as providing a convenient tagging mechanism. \emph{3e}
\item{Data Duplication:} By dividing a file into many shares, not
all of which are needed to reconstruct it, we gain protection against
the destruction or duplicity of Servnet nodes. \emph{2c, 2b, 2h, 3c}
\item{Legal Protections:} Information illegal in one place is
frequently legal in others.  Global suppression of a piece of
information is relatively rare.  The content-neutral policies also mean
that there is no reason to expect that a server operator has looked at
the data he holds, which might make prosecution more difficult. \emph{2a, 2g}
\item{The Time-Synchronization Protocol:} We rely on its security as
an abstracted object. \emph{1c, 2d}
\item{Volunteerism and the Hacker Ethic:} Owning a node of this
service is still going to put an administrator in a potentially tricky
situation.  We rely on the hacker ethic and a commitment to free
information flow to provide volunteers. \emph{2i, 3d}
\end{enumerate}

\section{The Trust Economy System}
The Free Haven Project relies on an economy of trust to protect the
data stored within it.  It requires that servers participating in the
Haven demonstrate in advance a capability to do a certain amount of
useful work before they are trusted with important data.  The goal is
to limit the amount of damage a potential adversary can do by
requiring useful work as a prerequisite for the access necessary to do
damage.  In the simplistic setup of the Free Haven Server, the
commodity chosen is megabyte-months.  A server which has stored ten
megabytes of data for two months will be trusted with up to twenty
megabyte-months of further data: eighty megabytes for a week, or one
megabyte for twenty months, or anything in between.
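Written out, trust is the product of size and duration.  The
abbreviation ``mo'' for months and the approximation of one week as a
quarter of a month are assumptions made here for the arithmetic:
\begin{eqnarray*}
10\,\mbox{MB} \times 2\,\mbox{mo} &=& 20\,\mbox{MB-mo},\\
80\,\mbox{MB} \times \frac{1}{4}\,\mbox{mo} &=& 20\,\mbox{MB-mo},\\
1\,\mbox{MB} \times 20\,\mbox{mo} &=& 20\,\mbox{MB-mo}.
\end{eqnarray*}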

By using this currency, we ensure that an adversary will always
provide at least as much useful work as it does harm --- that is,
every system in the Haven network is at least 50\% useful.  In addition,
certificates of accountability are used so that a server which is
misbehaving to this extent will be reasonably quickly identified to
the rest of the network.  When a server is entrusted with a share of
data, it provides a receipt certificate to the server which previously
held the data.  Should the trusted server lose the data, the server
with the certificate will be able to hold it accountable to the rest
of the network.

A rich adversary can easily defeat this simplistic currency model by
purchasing a ten-terabyte server and donating a great deal of space to
the network for a few days, then dropping great quantities of data.
To prevent this attack, the Free Haven Project is using a
two-dimensional currency.  A server can be trusted with a share which
is close in duration-size space to shares it has safely stored before,
but within limits specified by the individual server administrator.
For example, a 10 MB share for 1 month would have a distance of 5 from
a 7 MB share for 5 months.  The typical install would allow a distance
of up to 2, with a limit that times less than a week count as a week,
and that sizes over 100 GB count as 100 GB.         
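The distance measure is not spelled out here.  The example is
consistent with ordinary Euclidean distance on sizes in megabytes and
durations in months, which is the assumption made in this sketch; the
symbols $s_i$, $d_i$, and $\delta$ are introduced for illustration:
\begin{eqnarray*}
\delta &=& \sqrt{(s_1 - s_2)^2 + (d_1 - d_2)^2}\\
       &=& \sqrt{(10 - 7)^2 + (1 - 5)^2}\\
       &=& \sqrt{9 + 16} = 5.
\end{eqnarray*}
Under a limit of 2, a server whose only stored share was the 7 MB,
5 month one would therefore not yet be trusted with the 10 MB, 1 month
share.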
                                                    
We anticipate that even this economic model will be insufficient for
some scenarios, and that there will be ways of attacking or exploiting
whatever we implement in the first versions of Free Haven. We expect
to continue to adjust these values and assumptions based on
experimental data.                                  

\subsection{Interfacing to the Free Haven System}
The trust system will be implemented as a library of functions which
access a database of servnet nodes.  This database will be implemented
with a free portable database engine; to ensure modularity, the code
should allow easy switching between different database back-ends.  The
trust library should be able to answer certain questions from the
haven and communications modules:

\begin{itemize}
\item Should we trade a share of size $s$ and expiration date $d$ to host $H$?
\item What sort of share should we offer to trade with host $H$?
\item What sort of share should we offer to trade?
\item Whom should we offer a trade to?
\item Whom should we offer to trade share $S$ to?
\item What sort of share should we ask for from host $H$ in exchange
for a share of size $s$ and expiration date $d$?
\item What sort of share should we trade to host $H$?
\item Is our share $R$ for host $H$'s share $S$ an acceptable trade?
\item What is the public key for host $H$?
\end{itemize}

In order to answer these questions, the trust library must accept
certain information from the other modules:
\begin{itemize}
\item We traded our share $R$ for host $H$'s share $S$ at time $t$.
\item Host $H$ dropped share $S$.
\item Host $X$ says host $Y$ has trust rating $x$ and public key $k$.
\item Host $X$ says its new public key is $k$.
\item Host $X$ says its private key has been compromised.
\end{itemize}

The trust database also needs to keep itself updated, automatically
increasing trust ratings for nodes which keep shares until they
expire, and triggering trust-building trades by properly answering the
above questions.

\section{Conclusion}
The Free Haven Project will design and implement a servnet and mixnet
as described above.  The completed Free Haven will support anonymous
publication, storage, and retrieval of documents, together with a
pseudonymous trust model.  My work concerns developing the model for
the trust economy and implementing the server-side trust software.
This work is being performed in collaboration with the developers of
the communications module; the haven module for share storage, file
publishing, and file retrieval; and the user interface.  Together,
these modules will provide for an initial Free Haven release.

\section{Acknowledgements}
I would like to thank the following people for their assistance with this project:
\begin{itemize}
\item Roger Dingledine, for the creation of the Free Haven project, and a great deal of commentary on the issues of the trust economy.
\item The members of the Free Haven Project, for many hours of fruitful discussion.
\item Professor Ron Rivest, for his role as my project advisor and for
suggestions for the overall Free Haven design.
\end{itemize}
\end{multicols}
\end{document}
