This is a summary of the article by Maciá-Fernández et al., which proposes a method that ISPs can use for behavioral ad targeting without violating the Federal Wiretap Act. You can get the PDF of the article here: ISP-Enabled Behavioral Ad Targeting.
ISPs' efforts to implement behavioral ad targeting
Online advertising is a $20 billion industry, with Google as its biggest player. Google has grown rich by monetizing its users' web browsing. In contrast, ISPs have been reduced to ferrying internet traffic for those same users, unable to capture any of their online ad spending. To change this, some ISPs have resorted to collaborating with companies that use deep packet inspection: techniques that intercept web page requests in order to apply behavioral ad targeting. However, there is a legal issue here. ISPs, unlike Google, are not exempt from the Federal Wiretap Act, which plainly states that it is illegal to “intercept the contents of communications.”
To partially address this issue, some ISPs developed “performance advertising services,” which let users opt out of the ad targeting at any time. Still, there is a general fear that opting out stops the targeted ads but not the underlying data collection.
Using TCP information for legal ad targeting
This study shows, however, that ISPs can make their behavioral ads and user tracking legal. To do so, the controversial deep packet inspection techniques, and the reverse engineering of user browsing patterns from payloads, must be abandoned for an alternative. The main challenge is whether browsing patterns can be recovered from the limited information visible at the TCP layer. In parallel, website features are extracted for profiling; these include object location, link structure, cacheability, and so on. The two kinds of data are then correlated by a purpose-built algorithm.
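The paper itself does not publish code, but the profiling step can be pictured with a short Python sketch. Everything here (the ObjectCollector class, the profile_page function, the exact feature encoding) is illustrative rather than taken from the paper: for each page, the sketch records the root file's size and, for every embedded object, its size, location (same host or external), and cacheability.

```python
# Hypothetical sketch of per-page feature profiling (not the authors' code).
from urllib.request import urlopen
from urllib.parse import urljoin, urlparse
from html.parser import HTMLParser

class ObjectCollector(HTMLParser):
    """Collect URLs of embedded objects (images, scripts, stylesheets)."""
    def __init__(self):
        super().__init__()
        self.objects = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.objects.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self.objects.append(attrs["href"])

def profile_page(page_url):
    """Build a feature profile for one page: root-file size plus, per object,
    its size, location (on-site or external), and cacheability."""
    with urlopen(page_url) as resp:
        body = resp.read()
    collector = ObjectCollector()
    collector.feed(body.decode("utf-8", errors="replace"))
    site_host = urlparse(page_url).netloc
    profile = {"root_size": len(body), "objects": []}
    for ref in collector.objects:
        obj_url = urljoin(page_url, ref)
        with urlopen(obj_url) as resp:
            data = resp.read()
            cache_control = resp.headers.get("Cache-Control", "")
        profile["objects"].append({
            "size": len(data),
            "on_site": urlparse(obj_url).netloc == site_host,  # "location"
            "cacheable": "no-store" not in cache_control,      # "cacheability"
        })
    return profile
```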
The study succeeded in finding web browsing patterns in TCP headers alone. The previous, legally problematic technique relied on inspecting packet-level network traces in full; the method in this study instead aims to identify the websites users visit without touching the packet payload, i.e., the actual data carried by the packets. Furthermore, the approach is designed for access ISPs: the ISPs that connect users to the internet, which need only record TCP packet headers at a point in the network where traffic can be tapped.
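As a rough illustration of what “header-only” processing means, the following Python sketch reads a packet trace with the third-party scapy library (the paper does not prescribe any particular tool) and keeps only TCP/IP header fields. Note that even the payload length is derived from header fields, so the payload bytes themselves are never inspected.

```python
# Minimal sketch of header-only trace processing; scapy is an assumed
# tool choice, not one named by the paper.
from scapy.all import rdpcap, IP, TCP

def header_records(pcap_path):
    """Yield one record per TCP packet: timestamp, endpoints, flags,
    and payload *length* (computed from header fields alone)."""
    for pkt in rdpcap(pcap_path):
        if IP in pkt and TCP in pkt:
            ip, tcp = pkt[IP], pkt[TCP]
            # Total IP length minus IP and TCP header lengths gives the
            # payload size without ever reading the payload bytes.
            payload_len = ip.len - ip.ihl * 4 - tcp.dataofs * 4
            yield {
                "time": float(pkt.time),
                "src": (ip.src, tcp.sport),
                "dst": (ip.dst, tcp.dport),
                "flags": str(tcp.flags),
                "payload_len": payload_len,
            }
```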
The first step is to extract, from each client's trace, the total number of web pages the client visited, along with the location and size of each root file (the base HTML document present on every page, which references the objects that make up the page). Each user's subtrace is then divided into slices, and the TCP packet headers in each slice are used to recover website features. A Detection Algorithm, as the authors call it, then correlates the website profiles with the features extracted from the TCP headers. The algorithm proceeds in several phases, including a tagging phase and a selection phase.
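The paper's Detection Algorithm is more involved than can be reproduced here, but a toy Python sketch can convey the shape of the two phases just mentioned. The matching rules below (a root-size tolerance for tagging, a total-byte distance for selection) are assumptions made for illustration, not the authors' actual criteria.

```python
# Toy illustration of tagging and selection (assumed structure, not the
# authors' actual Detection Algorithm).

def tag_slice(slice_features, page_profiles, tolerance=0.1):
    """Tagging phase: a page is a candidate for this slice if the slice's
    observed root-file size is within `tolerance` of the profiled size."""
    candidates = []
    for page, profile in page_profiles.items():
        if abs(slice_features["root_size"] - profile["root_size"]) \
                <= tolerance * profile["root_size"]:
            candidates.append(page)
    return candidates

def select_page(slice_features, candidates, page_profiles):
    """Selection phase: among the candidates, keep the page whose estimated
    total download size is closest to the slice's observed byte count."""
    def distance(page):
        est = page_profiles[page]["root_size"] + sum(
            o["size"] for o in page_profiles[page]["objects"])
        return abs(slice_features["total_bytes"] - est)
    return min(candidates, key=distance, default=None)
```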
The algorithm achieved very high detection rates. The authors attribute this performance to the algorithm's ability to exploit the statistical diversity of pages across all the websites explored. Furthermore, even when website profiles become outdated, detection remains robust as time passes.