Fingerprinting SecureDrop .onion services


#1

Hello,

In a recent paper, "How Unique is Your .onion?", the authors included some SecureDrop .onion sites and scored how fingerprintable each site is using several website fingerprinting algorithms. Among them is the one from "Effective Attacks and Provable Defenses for Website Fingerprinting".

It would be interesting to repeat the process to confirm the results of the paper. And we could also add that to the SecureDrop integration tests. I’m not saying the results in the paper are incorrect: I will only be able to formulate an opinion when I’m able to reproduce them.

Cheers


#2

Note we did have a project to do this (now defunct - this would take significant effort to convert into integration tests): https://github.com/freedomofpress/fingerprint-securedrop/


#3

Here is some research into what kind of random padding would effectively mitigate this threat: https://petsymposium.org/2017/papers/issue2/paper54-2017-2-source.pdf


#4

My personal opinion is to figure out some random padding and just do it. For nearly two years the project has collectively been aware of the risk of website fingerprinting, a problem amplified by the small set of Tor hidden services and the fact that almost every SecureDrop site is identical. That it’s within the realm of possibility has already been borne out by the research. We invested in crawling / machine learning tools in order to figure out if we can do it ourselves and get positive results, seemingly in order to confirm what we already know. Meanwhile, during the whole time folks were doing that, SecureDrop sites have continued to reside on the onion web, vulnerable to passive attacks, with no defenses deployed. We suspect random padding at some layer will help marginally; we just don’t know how much. However, we already know that adding random padding wouldn’t necessarily hurt. IMO there is no excuse: you should just do it.


#5

Unless someone else is willing, I’ll start working on this tomorrow.


#6

Sorry - I missed this thread - we get notifications in the FPF slack for comments on GitHub but not yet Discourse.

However, we already know that adding random padding wouldn’t necessarily hurt. IMO there is no excuse: you should just do it.

Just limiting the conversation here to the paper in question, it states that the most distinguishing feature for SecureDrop was the large size of the site (total incoming packet size). Adding random padding does not address this. Indeed, it may make* the situation worse, as already very large sites with random padding may be even easier to fingerprint.
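To make that concrete, here is a toy sketch (not the paper's method) of an observer who looks only at total incoming size. All byte counts, the padding bound, and the helper names are hypothetical, and real attacks use many more features than this one.

```python
import random

def padded_size(base_size, max_padding, rng):
    """Total incoming bytes seen when 0..max_padding random bytes are added."""
    return base_size + rng.randint(0, max_padding)

def guess_is_large_site(observed_size, threshold):
    """Toy classifier: flag any load whose total size exceeds the threshold."""
    return observed_size > threshold

rng = random.Random(0)

# Hypothetical sizes in bytes; none of these come from the paper.
typical_sizes = [20_000, 35_000, 50_000, 80_000]
large_site_size = 400_000
max_padding = 50_000

# The attacker picks a threshold no typical site can exceed, even fully padded.
threshold = max(typical_sizes) + max_padding

trials = 1000
large_hits = sum(
    guess_is_large_site(padded_size(large_site_size, max_padding, rng), threshold)
    for _ in range(trials)
)
false_alarms = sum(
    guess_is_large_site(padded_size(size, max_padding, rng), threshold)
    for size in typical_sizes
    for _ in range(trials)
)

print(f"large site flagged in {large_hits}/{trials} loads")
print(f"typical sites flagged in {false_alarms}/{trials * len(typical_sizes)} loads")
```

Under these assumed numbers the padding only ever adds bytes, so the large site stays above the threshold on every load while the smaller sites never reach it.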

From the paper on SecureDrop specifically:

In particular, we noted that these pages embed images and use scripts and CSS styles that make them large and therefore distinguishable.

If we want to move toward the goal of making SecureDrop less fingerprintable, we should first decrease the size of the site, which is in line with e.g. https://github.com/freedomofpress/securedrop/pull/1298.

If it were that simple, we would indeed “just do it”, but this is a complicated problem, as evidenced by the many academic computer science researchers working on it, with suggestions evolving with each paper ;).

[*] Note that we can’t say definitively, as we have no method for testing. Furthermore, I do not think the SecureDrop engineering team should spend significant effort on developing this testing framework; instead, we should focus our limited resources on addressing the much easier attacks on source anonymity that we know are currently happening.




#7

Hey @redshiftzero

Thanks for explaining :slight_smile:

Is there a place where I could learn more about that? It makes perfect sense to me that a web page with a fixed size can be used to fingerprint a site (it may be the only page in all THS with this size), but I don’t get how a large page can be a problem. Either it is the largest page of all THS, in which case, well yes, that makes sense. But I’d be surprised if it was the case :wink: Or its size is among the largest of all THS, and I understand it would be better if its size were closer to the average. But even in this case, varying the size of the page looks like an effective way of improving the situation. I’m not trying to make a point: I’m very ignorant about all this. I’m writing out my reasoning so you can better point at what’s wrong with it.
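One way to put the reasoning above into numbers is to count how many sites a padded page could be confused with. This is only a sketch under made-up assumptions: the site names, sizes, and the `anonymity_set` helper are hypothetical, and it considers total size only.

```python
def anonymity_set(target, sizes, max_padding):
    """Sites whose padded size range [size, size + max_padding] overlaps the target's."""
    lo = sizes[target]
    hi = sizes[target] + max_padding
    return sorted(
        name
        for name, size in sizes.items()
        if size <= hi and size + max_padding >= lo
    )

# Hypothetical total page sizes in bytes.
sizes = {
    "site_a": 20_000,
    "site_b": 30_000,
    "site_c": 45_000,
    "site_d": 60_000,
    "large_site": 400_000,
}

# No padding: every size is fixed, so each site is alone in its set.
fixed = anonymity_set("site_b", sizes, max_padding=0)

# Padding helps a site that has neighbours of similar size...
padded = anonymity_set("site_b", sizes, max_padding=20_000)

# ...but an outlier stays alone unless the padding spans the whole gap.
outlier = anonymity_set("large_site", sizes, max_padding=20_000)

print(fixed, padded, outlier)
```

With these assumed sizes, varying the size does widen the set for a mid-sized page, but the very large page is still the only candidate for its range.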

I do not think the SecureDrop engineering team should spend significant effort on developing this testing framework and instead we should focus our limited resources on addressing the much easier attacks on source anonymity that we know are currently happening.

I agree 100%. I was hoping for an easy way to improve things and made this pull request in that spirit. I’m not very excited at the idea of spending weeks working on this specific topic, especially since there is no way to reproduce the conclusions found in the academic papers.

However, I’m very motivated to understand why a fixed size is not the easiest fingerprinting signature.

Cheers