Please use this identifier to cite or link to this item: http://hdl.handle.net/2451/31830
Title: Estimating Audience Interest Distribution based on Audience Web Behavior
Authors: Zhang, Xiaohan
Tsemekhman, Kiril
Provost, Foster
Issue Date: 19-Jun-2013
Series/Report no.: CBA-13-01
Abstract: The increasing availability of massive data on users' online behavior presents exciting opportunities for business analytics. In particular, if we could model the distributions of interests of visitors to webpages (or websites), we could apply the result to applications including site optimization, advertisement targeting, content creation, internal offer merchandizing, sponsorship, and general customer analytics. We first present a two-stage generative model for estimating audience interest distributions (AID) for websites, based in part on estimating individual (anonymized) user interest distributions from their observed visitation patterns to labeled websites. The model yields the following interpretation: the AID is the expected interest distribution of a visitor to the website. Estimating AID is important for several reasons: (i) contextual categorization of websites is expensive and/or error prone at large scale, (ii) even under favorable assumptions, contextual categorization provides only a narrow view of user interests, and (iii) certain sorts of sites (image, video, social) do not lend themselves to easy/accurate contextual categorization. The paper then demonstrates and evaluates the model on a massive set of (anonymized) data from a large online advertising company. We show two main findings. (1) In a predictive modeling-style evaluation, for sites where user interests are (partially) known, the model predicts them well. (2) For pages where contextual categorization does not estimate user interests well (specifically, image pages), the model does estimate them well. We also provide qualitative results demonstrating how the model can reveal interests that are not apparent from contextual categorization.
URI: http://hdl.handle.net/2451/31830
Appears in Collections: Center for Business Analytics Working Papers

Files in This Item:
There are no files associated with this item.


Items in FDA are protected by copyright, with all rights reserved, unless otherwise indicated.