Mining and Summarizing Customer Reviews
Authors – Minqing Hu and Bing Liu
Year – 2004
Published in – Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Link – http://sifaka.cs.uiuc.edu/course/591cxz04f/peng1.pdf
Importance to my Research – Very High
MY REVIEW
This paper proposes a product review classification system that downloads reviews from the web and classifies them as positive or negative with respect to each product feature. To do so, the framework automatically identifies the features of the product or service being reviewed and then mines the relevant sentences for opinions. Finally, the system presents the user with a summary of the reviews for a particular product. For their experiments, the authors downloaded product reviews from Amazon and C|Net. The proposed system can be used by product manufacturers to improve their products and by customers deciding which product to buy.
Before I begin the review, I should say that this is one of the best-written papers. It is so well organized and thought through that every question I had in mind was answered at the right time. I congratulate the authors on their excellent effort in writing this paper, which made reading it a pleasure.
Some Future Directions
- It would be a good idea to provide additional statistics based on the review classification. For example, I would be interested in knowing how many people say that the picture quality is acceptable in bright sunlight but not very good in dim light. I guess the current system only identifies picture quality as a feature and puts all the reviews related to picture quality under it. Am I correct?
more details coming soon…
Cite this article as
Critical Review on “Mining and Summarizing Customer Reviews” by V. Potdar, 10th Mar, 2008. Available Online – http://drvidy.wordpress.com/2008/03/10/mining-and-summarizing-customer-reviews/
Abstract
Merchants selling products on the Web often ask their customers to review the products that they have purchased and the associated services. As e-commerce is becoming more and more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in hundreds or even thousands. This makes it difficult for a potential customer to read them to make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track and to manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and to summarize all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions and whether the opinions are positive or negative. We do not summarize the reviews by selecting a subset or rewrite some of the original sentences from the reviews to capture the main points as in the classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results using reviews of a number of products sold online demonstrate the effectiveness of the techniques.
Important Terms
- Feature Based Summarization
- Cognitive Linguistics
- Sentiment Classifiers
- Text Summarization
- Subject Genre Classification
- Terminology Finding
Reference Sheet
In this research, we study the problem of generating feature-based summaries of customer reviews of products sold online. Here, features broadly mean product features (or attributes) and functions. Given a set of customer reviews of a particular product, the task involves three subtasks:
- Identifying features of the product that customers have expressed their opinions on (called product features);
- For each feature, identifying review sentences that give positive or negative opinions; and
- Producing a summary using the discovered information.
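To make the three subtasks concrete for myself, here is a minimal Python sketch of what a feature-based summary could look like. It is not the authors' method: the feature list and the positive and negative word lists are hard-coded assumptions of mine standing in for the paper's mining steps, and the names `FEATURES`, `sentence_orientation` and `summarize` are hypothetical.

```python
# Assumption: features and opinion words are given by hand here.
# In the paper they are discovered automatically from the reviews.
FEATURES = ["picture quality", "battery", "size"]
POSITIVE_WORDS = {"great", "good", "amazing", "excellent"}
NEGATIVE_WORDS = {"poor", "bad", "short", "terrible"}


def sentence_orientation(sentence):
    """Return 'positive', 'negative', or None by counting word-list hits."""
    words = [w.strip(".,!?;:").lower() for w in sentence.split()]
    positives = sum(w in POSITIVE_WORDS for w in words)
    negatives = sum(w in NEGATIVE_WORDS for w in words)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return None  # no opinion words, or a tie


def summarize(sentences):
    """Group opinion sentences under each feature they mention."""
    summary = {f: {"positive": [], "negative": []} for f in FEATURES}
    for sentence in sentences:
        orientation = sentence_orientation(sentence)
        if orientation is None:
            continue  # not treated as an opinion sentence in this sketch
        lowered = sentence.lower()
        for feature in FEATURES:
            if feature in lowered:
                summary[feature][orientation].append(sentence)
    return summary


REVIEWS = [
    "The picture quality is amazing.",
    "Battery life is too short.",
    "I bought it last week.",
]

if __name__ == "__main__":
    for feature, groups in summarize(REVIEWS).items():
        print(feature)
        for orientation, found in groups.items():
            print(f"  {orientation}: {len(found)}")
```

The result is a structured summary (feature, then positive and negative sentences with their counts) rather than a shorter free-text document. The sketch ignores negation and simply matches feature names as substrings, so it only illustrates the shape of the output.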
Our task is different from traditional text summarization [15, 39, 36] in a number of ways.
First of all, a summary in our case is structured rather than another (but shorter) free text document as produced by most text summarization systems.
Second, we are only interested in features of the product that customers have opinions on and also whether the opinions are positive or negative.
We do not summarize the reviews by selecting or rewriting a subset of the original sentences from the reviews to capture their main points as in traditional text summarization.
Genre classification classifies texts into different styles, e.g., “editorial”, “novel”, “news”, “poem” etc. Although some techniques for genre classification can recognize documents that express opinions [23, 24, 14], they do not tell whether the opinions are positive or negative.
A more closely related work is [17], in which the authors investigate sentence subjectivity classification and conclude that the presence and type of adjectives in a sentence are indicative of whether the sentence is subjective or objective.
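The adjective observation can be illustrated with a rough Python sketch. This is only my own toy version of the idea, not the classifier from [17]: the small `ADJECTIVES` set is a hand-picked assumption that stands in for a real part-of-speech tagger, and the function names are hypothetical.

```python
# Assumption: a tiny hand-picked adjective list replaces a part-of-speech tagger.
ADJECTIVES = {"amazing", "poor", "good", "bad", "heavy", "sharp", "blurry"}


def adjectives_in(sentence):
    """Return the known adjectives found in the sentence, in order."""
    tokens = [t.strip(".,!?;:").lower() for t in sentence.split()]
    return [t for t in tokens if t in ADJECTIVES]


def looks_subjective(sentence):
    """Treat a sentence as subjective if it contains at least one adjective."""
    return len(adjectives_in(sentence)) > 0


if __name__ == "__main__":
    for s in [
        "The pictures are sharp and the colors are good.",
        "The camera ships with a 16 MB memory card.",
    ]:
        print(looks_subjective(s), "-", s)
```

A real system would tag parts of speech properly and would also look at the type of adjective, which this sketch does not attempt.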
Useful References


