Authors: Roberto Gonzalez, Claudio Soriente, Nikolaos Laoutaris
Partners involved: NEC, TID
Proceeding: IMC ’16 Proceedings of the 2016 Internet Measurement Conference, Pages 373-379
Abstract: Tracking users within and across websites is the base for profiling their interests, demographic types, and other information that can be monetised through targeted advertising and big data analytics. The advent of HTTPS was supposed to make profiling harder for anyone beyond the communicating end-points. In this paper we examine to what extent the above is true. We first show that by knowing the domain that a user visits, either through the Server Name Indication of the TLS protocol or through DNS, an eavesdropper can already derive basic profiling information, especially for domains whose content is homogeneous. For domains carrying a variety of categories that depend on the particular page that a user visits, i.e., news portals, e-commerce sites, etc., the basic profiling technique fails. Still, accurate profiling remains possible through traffic fingerprinting that uses network traffic signatures to infer the exact page that a user is browsing, even under HTTPS. We demonstrate that transport-layer fingerprinting remains robust and scalable despite hurdles such as caching, dynamic content for different device types etc. Overall our results indicate that although HTTPS makes profiling more difficult, it does not eradicate it by any means.
- We study how effectively HTTPS protects users from profiling by network observers while they browse the internet.
- Our main finding demonstrates that an observer can use the SNI header of TLS, along with other sources of information such as DNS, to build a rough profile of users.
- Moreover, we define a method that uses traffic fingerprinting to profile online users with high accuracy.
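
The domain-level profiling described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `DOMAIN_CATEGORIES` table, the `rough_profile` function, and the `.example` hostnames are all hypothetical, and the sketch assumes the observer has already extracted hostnames from SNI headers or DNS queries. Domains with heterogeneous content (for instance, a news portal) are mapped to `None`, which shows why this basic technique yields no signal for them.

```python
from collections import Counter

# Hypothetical domain-to-category table (an assumption for illustration).
# None marks domains whose content is too varied to categorise by name alone.
DOMAIN_CATEGORIES = {
    "espn.example": "sports",
    "recipes.example": "cooking",
    "news.example": None,
}


def rough_profile(hostnames, categories=DOMAIN_CATEGORIES):
    """Return (share of classified visits per category, unclassified count)."""
    counts = Counter()
    unclassified = 0
    for host in hostnames:
        # Normalise case and drop a trailing root dot before the lookup.
        category = categories.get(host.lower().rstrip("."))
        if category is None:
            # Unknown domain, or a domain with heterogeneous content.
            unclassified += 1
        else:
            counts[category] += 1
    total = sum(counts.values())
    shares = {c: n / total for c, n in counts.items()} if total else {}
    return shares, unclassified
```

Calling `rough_profile` on a list of observed hostnames returns the per-category shares together with the number of visits that could not be classified. A large unclassified count corresponds to the case where, according to the abstract, page-level traffic fingerprinting is needed instead.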