%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Overview
 - The problem we're trying to solve.
 - Background: anonymity systems
 - Background: Internet routing
 - Some things we learned
 - Questions we still need to answer

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The big picture

Alice wants to transact with Bob on the Internet (fetch a web page,
send an email) without letting anybody link them together.

Alice wants to be safe from somebody watching her or somebody watching
Bob.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

A trusted proxy isn't good enough

<picture of single-hop proxy>

These proxies are trust and performance bottlenecks.
Add a constraint: want to be safe from a compromised middle node too.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

So: distributed trust

<picture of network of nodes>

By using multiple hops, no single node can link Alice to Bob.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Tor and Mixmaster

Two widely deployed anonymity networks (thousands of users each).

Tor is for TCP streams (low-latency).

Mixmaster is for email (high-latency).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Two major attack classes

Follow-the-transaction attack: try to learn each hop of a
transaction and follow it from beginning to end.

Endpoint attack: use statistics to match a transaction coming into the
network to a transaction leaving the network.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Three major defense classes

Batching and pooling: delay messages to get a large batch, and mix them
together to hinder the adversary from linking Alice's message to Alice.

Padding: senders provide decoy traffic as well as normal traffic.

Dispersal: reduce the chance that the adversary sees enough of the
network to complete his attack.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Dispersal approaches

Grow the network so a given adversary sees less.

Arrange the topology so messages can enter or exit at many
places (e.g. cascade vs free route).

Location arbitrage: spread each transaction over multiple jurisdictions.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Many different families of locations

E.g. Areas controlled by a single country, state, company, ...

E.g. Nodes running the same operating system or class of software.

We focus here on the family of locations that are ISPs. More
correctly, \emph{autonomous systems} (ASes).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

AS-level paths

The key insight is that while we typically think of a connection as going
from Alice to Node1, actually it traverses many different ASes for that
single hop.

Paths based on policies, not just shortest-path

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Need to passively estimate paths

Can't pull down all routing tables; can't traceroute.

Used Oregon RouteViews Project data to learn adjacencies.

Mao et al's [24] estimation technique is >80\% right.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Location independence metric

What is the chance that some AS is on both the path from Alice to the
mix-net, and also the path from the mix-net to Bob?

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Question one

Is considering IP prefix good enough?

Tarzan, Morphmix, etc recommend this.

Not the same. In practice, we see several cases of same-AS nodes
with different prefixes

Of the 5 pairs in Mixmaster in the same AS, three have different class
A prefixes!

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Question two (1)

How much can one AS attack inside the network?

This lets him follow a transaction (easier than doing stats).

Also means we're not getting the full protection of n hops
that we thought we were.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Question two (2)

Top two AS-level between-node observers in the US, to both Tor and
Mixmaster: Level 3 and Abovenet

Together they watch over half the links in the Tor network.

Choosing paths without replacement helps: a 4-hop Tor path can be observed
by a single AS with prob .10, compared to .16 if replacement is allowed.

Don't forget that forward paths may be different from reverse paths.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Question three

How much can one AS attack the endpoints?

Remember that it's sufficient to look at endpoints of the network,
both for low-latency or high-latency networks.

Endpoints can be first and last node, but they can also be Alice and Bob.

We picked some reasonable sounding Alices and Bobs, mostly in the US.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Endpoint attacks are an issue

Given random entry and exit points, a single AS will often be
able to win 10\% to 30\% of the time.

It's possible to reduce this to almost 0.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Node placement

With US-based Alices and Bobs, adding a far-flung node (e.g. in Asia)
*hurts* us, not helps.

Best node placement for protection against the AS-level adversary is in
ASes that have the most links to other ASes: tier-1 ISPs.

A given transaction is safest when Alice, Bob, or both are in tier-1 ISPs.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Future work (1)

Consider a more diverse set of Alices and Bobs.

How to get routing table to Alice? Are there practical approximations
that still work ok?

How sensitive is this metric to adding or subtracting a few nodes?

What about repeated web fetches, using different entry and exit points
each time -- how quickly does Alice's location independence degrade?

Correlation between speed/reliability of network and its location
independence?

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Future work (2)

Do this analysis for different location metrics, such as countries.

Caching at exit nodes (when feasible) changes the equation.

Akamai? Different routing; also dangerous observer.

Do we *hurt* anonymity by restricting path choices, against larger
adversaries who can take advantage of knowing our algorithm?