Learning and Inference in Massive Social Networks

Hill, Shawndra; Provost, Foster; Volinsky, Chris

Full metadata record

DC Field	Value	Language
dc.contributor.author	Hill, Shawndra	-
dc.contributor.author	Provost, Foster	-
dc.contributor.author	Volinsky, Chris	-
dc.date.accessioned	2008-12-03T17:53:17Z	-
dc.date.available	2008-12-03T17:53:17Z	-
dc.date.issued	2007-08	-
dc.identifier.citation	The 5th International Workshop on Mining and Learning with Graphs, August 2007	en
dc.identifier.uri	http://hdl.handle.net/2451/27812	-
dc.description.abstract	Researchers and practitioners increasingly are gaining access to data on explicit social networks. For example, telecommunications and technology firms record data on consumer networks (via phone calls, emails, voice-over-IP, instant messaging), and social-network portal sites such as MySpace, Friendster and Facebook record consumer-generated data on social networks. Inference for fraud detection [5, 3, 8], marketing [9], and other tasks can be improved with learned models that take social networks into account and with collective inference [12], which allows inferences about nodes in the network to affect each other. However, these socialnetwork graphs can be huge, comprising millions to billions of nodes and one or two orders of magnitude more links. This paper studies the application of collective inference to improve prediction over a massive graph. Faced initially with a social network comprising hundreds of millions of nodes and a few billion edges, our goal is: to produce an approximate consumer network that is orders of magnitude smaller, but still facilitates improved performance via collective inference. We introduce a sampling technique designed to reduce the size of the network by many orders of magnitude, but to keep linkages that facilitate improved prediction via collective inference. In short, the sampling scheme operates as follows: (1) choose a set of nodes of interest; (2) then, in analogy to snowball sampling [14], grow local graphs around these nodes, adding their social networks, their neighbors’ social networks, and so on; (3) next, prune these local graphs of edges which are expected to contribute little to the collective inference; (4) finally, connect the local graphs together to form a graph with (hopefully) useful inference connectivity. We apply this sampling method to assess whether collective inference can improve learned targeted-marketing models for a social network of consumers of telecommunication services. Prior work [9] has shown improvement to the learning of targeting models by including social-neighborhood information—in particular, information on existing customers in the immediate social network of a potential target. However, the improvement was restricted to the “network neighbors”, those targets linked to a prior customer thought to be good candidates for the new service. Collective inference techniques may extend the predictive influence of existing customers beyond their immediate neighborhoods. For the present work, our motivating conjecture has been that this influence can improve prediction for consumers who are not strongly connected to existing customers. Our results show that this is indeed the case: collective inference on the approximate network enables significantly improved predictive performance for non-network-neighbor consumers, and for consumers who have few links to existing customers. In the rest of this extended abstract we motivate our approach, describe our sampling method, present results on applying our approach to a large real-world target marketing campaign in the telecommunications industry, and finally discuss our findings.	en
dc.description.sponsorship	NYU, Stern School of Business, IOMS Department, Center for Digital Economy Research	en
dc.format.extent	131106 bytes	-
dc.format.mimetype	application/pdf	-
dc.language.iso	en_US	en
dc.relation.ispartofseries	CeDER-PP-2007-05	en
dc.subject	social networks	en
dc.subject	collective inference	en
dc.subject	viral marketing	en
dc.title	Learning and Inference in Massive Social Networks	en
dc.type	Article	en
Appears in Collections:	CeDER Published Papers

Files in This Item:

File	Description	Size	Format
CPP-05-07.pdf		128.03 kB	Adobe PDF	View/Open

Show simple item record