Broad vs Narrow: Modelling Strategies for Online Behavioural Targeting

January 6, 2012

This is a summary of an article by Markus Svensen et al. A PDF of the behavioural targeting article is available here: Broad vs Narrow: Modelling Strategies for Online Behavioural Targeting.

The estimated value of online advertising for 2011 is 28.5 billion dollars. Online advertising lets you reach a wide audience, ideally the people most interested in the products you offer. Display ads are one kind of online advertising. One way to make display ads work is contextual advertising, in which the ads placed on a website are contextually similar to the site's content.

Behavioural targeting, on the other hand, helps advertisers reach consumers with ads that are not necessarily contextually similar to the page, but that a user will respond to because they are relevant to his or her interests. It divides online users into specified segments, a form of audience segmentation. There are many ways to do behavioural targeting; this article discusses several proposed models.

Behavioural Targeting Using Click Prediction

Click prediction treats a user's click on an advertisement as the only indication that the user is interested in the product being advertised. Clicks are directly observable, and they carry a lot of weight in how advertisers evaluate the effectiveness of their campaigns.

Matchbox is a probabilistic Bayesian model used to match users and items. It groups users who have a similar interest in certain items, and groups items that are rated similarly by users. A Matchbox model is composed of two sub-models: a linear model and a bi-linear model. The linear model captures bias effects, and the bi-linear model captures the interaction between users and advertisements.
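To make the two sub-models concrete, here is a minimal sketch of how a linear bias term and a bi-linear interaction term might combine into a click probability. It is not the authors' implementation. It assumes point-estimate weights and a logistic link, whereas Matchbox itself maintains probability distributions over its weights and uses approximate Bayesian inference. The names click_probability, w_user, w_ad, U and V, and all the numbers, are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def click_probability(x_user, y_ad, w_user, w_ad, U, V):
    # Linear sub-model: bias effects of the user and ad features on their own.
    bias = dot(w_user, x_user) + dot(w_ad, y_ad)
    # Bi-linear sub-model: project user and ad features into a shared
    # latent space and take the inner product as the interaction strength.
    user_traits = matvec(U, x_user)
    ad_traits = matvec(V, y_ad)
    interaction = dot(user_traits, ad_traits)
    # Assumption: a logistic link turns the combined score into a probability.
    return sigmoid(bias + interaction)


# Hypothetical example: 3 user features, 2 ad features, 2 latent dimensions.
x_user = [1.0, 0.0, 1.0]
y_ad = [0.0, 1.0]
w_user = [0.1, -0.2, 0.05]
w_ad = [-0.3, 0.2]
U = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
V = [[0.0, 1.0], [1.0, 0.0]]

p_click = click_probability(x_user, y_ad, w_user, w_ad, U, V)
print(f"predicted click probability: {p_click:.3f}")
```

With all weights set to zero the sketch returns 0.5, which is a quick sanity check that neither sub-model contributes anything until it is fitted.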

For the models used in this study, the user features are a unique numerical ID, AgeBand, Country, Gender, and several variables related to queries and page views. The advertisement features are ID, Type, Industry and Size (or width and height).

Data

Fifteen days' worth of data from the Microsoft display network and Bing was used. Impressions and clicks were counted daily. Overall there were 1.8 million users, 606 thousand clicks and 284 million impressions for 3,270 advertisements. After applying further conditions, such as excluding impressions that corresponded to a small number of ads which might bias the results, the data set comprised 174 thousand clicks, 78 million impressions, 127 thousand users and 2,793 ads.
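The counts above imply that clicks are rare compared with impressions. A short calculation shows the scale. It uses the rounded figures quoted in this summary, so the results are approximate, and the function name click_through_rate is just an illustrative choice.

```python
def click_through_rate(clicks, impressions):
    """Fraction of impressions that resulted in a click."""
    return clicks / impressions


# Rounded counts taken from the summary above.
raw_ctr = click_through_rate(606_000, 284_000_000)
filtered_ctr = click_through_rate(174_000, 78_000_000)

print(f"raw data set:      {raw_ctr:.4%}")
print(f"filtered data set: {filtered_ctr:.4%}")
```

In both cases only about 0.2% of impressions lead to a click, so the prediction problem is heavily imbalanced.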

Page views and queries make up the behavioural data. Each one is assigned a specific category, and the categories are related to one another in a forest-like structure.

Experiments

An experimental setup was designed to find the best combination of user, advertisement and context features. A computer cluster was used to run many experiments and score the fitted models using three performance measures: area under the receiver operating characteristic curve (AUCROC), area under the precision-recall curve (AUCPR), and marginal log-likelihood (llh).
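As a rough illustration of the three measures, the sketch below computes them for a handful of made-up labels and predicted click probabilities. It is not the scoring code from the study. It makes three assumptions: AUCROC is computed from pairwise comparisons of clicked and non-clicked examples, AUCPR is approximated by average precision, and llh is taken as the mean log-probability the model assigns to the observed outcomes. Both classes are assumed to be present and no predicted probability is exactly 0 or 1.

```python
import math


def auc_roc(labels, scores):
    """Probability that a random click is scored above a random non-click."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each click."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions)


def mean_log_likelihood(labels, scores):
    """Average log-probability assigned to the observed outcomes."""
    total = 0.0
    for y, p in zip(labels, scores):
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(labels)


# Made-up example: 1 = click, 0 = no click.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]

roc = auc_roc(labels, scores)
ap = average_precision(labels, scores)
llh = mean_log_likelihood(labels, scores)
print(f"AUCROC={roc:.3f}  AUCPR (avg. precision)={ap:.3f}  llh={llh:.3f}")
```

The first two measures depend only on how the examples are ranked, while the log-likelihood also penalises probabilities that are poorly calibrated, which is one reason to report all three.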

Data from the first M days was used for training, and the remaining days were used for scoring.
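A minimal sketch of a day-based split, assuming the first M days go to training and the remaining days to scoring. The record layout (day, user_id, ad_id, clicked) and the function name split_by_day are hypothetical and only serve the illustration.

```python
def split_by_day(records, m_days):
    """Put days 1..m_days in the training set and later days in the scoring set."""
    train = [r for r in records if r["day"] <= m_days]
    score = [r for r in records if r["day"] > m_days]
    return train, score


# Hypothetical impression log covering 15 days, one record per day.
records = [
    {"day": day, "user_id": 100 + day, "ad_id": 7, "clicked": day % 5 == 0}
    for day in range(1, 16)
]

train_set, score_set = split_by_day(records, m_days=10)
print(len(train_set), "training records,", len(score_set), "scoring records")
```

Splitting on time rather than at random keeps the scoring data strictly later than the training data, which mirrors how a model would be used in practice.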

Results and Discussion

The Bayesian logistic regression type model came out on top among the several Matchbox models used in this study. The best strategy for training data was to restrict it to a single topic, rather than pooling data that spans multiple topics. The study also addressed a better way to measure the various interests of individual users for more effective behavioural targeting. This information can then be used with data obtained from mobile devices, social networking sites and the like, provided that privacy concerns are resolved first.
