Please use this identifier to cite or link to this item:
http://hdl.handle.net/2451/27680
|
| Title: | Estimating the Socio-Economic Impact of Product Reviews: Mining Text and
Reviewer Characteristics |
| Authors: | Ghose, Anindya; Ipeirotis, Panagiotis G. |
| Issue Date: | 2-Sep-2008 |
| Series/Report no.: | CeDER-08-06 |
| Abstract: | With the rapid growth of the Internet, the ability of users to create
and publish content has created active electronic communities that
provide a wealth of product information. However, the high volume of
reviews that are typically published for a single product makes it harder
for individuals as well as manufacturers to locate the best reviews and
understand the true underlying quality of a product. In this paper, we
re-examine the impact of reviews on economic outcomes like product sales
and see how different factors affect social outcomes like the extent of
their perceived usefulness. Our approach explores multiple aspects of
review text, such as lexical, grammatical, semantic, and stylistic
levels to identify important text-based features. In addition, we
examine multiple reviewer-level features such as average usefulness of
past reviews and the self-disclosed identity measures of reviewers that
are displayed next to a review. Our econometric analysis reveals that
the extent of subjectivity, informativeness, readability, and linguistic
correctness in reviews matters in influencing sales and perceived
usefulness. Reviews that have a mixture of objective and highly
subjective sentences have a negative effect on product sales, compared
to reviews that tend to include only subjective or only objective
information. However, such reviews are considered more informative (or
helpful) by the users. By using Random Forest-based classifiers, we show
that we can accurately predict the impact of reviews on sales and their
perceived usefulness. Products that have received widely
fluctuating reviews also have reviews of widely fluctuating
helpfulness. In particular, we find that highly detailed and readable
reviews can have low helpfulness votes in cases when users tend to vote
negatively not because they disapprove of the review quality but rather
to convey their disapproval of the review polarity. We examine the
relative importance of the three broad feature categories:
'reviewer-related' features, 'review subjectivity' features, and 'review
readability' features, and find that using any of the three feature sets
yields performance statistically equivalent to that of
using all available features. This paper is the first study that
integrates econometric, text mining, and predictive modeling techniques
toward a more complete analysis of the information captured by
user-generated online reviews in order to estimate their socio-economic
impact. Our results have implications for the judicious design of
opinion forums. |
| URI: | http://hdl.handle.net/2451/27680 |
| Appears in Collections: | CeDER Working Papers
|
All items in Faculty Digital Archive are protected by copyright, with all rights reserved.
|