Review mining for business intelligence
Page 1 of 47.
Outline
Review mining from online media
Objectives and challenges ... discussions
Intuitively
Hot discussions = outstanding sales performance
A real example from the movie sector
The Da Vinci Code
Over The Hedge
User Rating
Our solution: Product sales prediction with word-of-mouth
ARSA: A sentiment-aware model
Two factors:
Box office revenue of the preceding days
People’s sentiment about the movie
Influence of past BO revenue
Effect of sentiment
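The two factors above can be sketched as a linear autoregressive fit, where the next day's revenue is a weighted sum of past revenue and past sentiment scores. This is a minimal sketch under stated assumptions, not the ARSA implementation: the function names, the lag counts `p` and `q`, and the per-day sentiment vectors are hypothetical, and the sentiment scores are taken as given rather than mined from reviews. Plain Python is used to keep the example dependency-free.

```python
from typing import List, Sequence, Tuple


def build_rows(revenue: Sequence[float],
               sentiment: Sequence[Sequence[float]],
               p: int, q: int) -> Tuple[List[List[float]], List[float]]:
    """Build (features, target) pairs for a linear autoregressive model.

    revenue[t]   : box office revenue on day t
    sentiment[t] : sentiment scores for day t (hypothetical input format)
    p, q         : number of past days of revenue / sentiment to use
    """
    start = max(p, q)
    X, y = [], []
    for t in range(start, len(revenue)):
        row = [revenue[t - i] for i in range(1, p + 1)]  # influence of past revenue
        for j in range(1, q + 1):                        # effect of past sentiment
            row.extend(sentiment[t - j])
        X.append(row)
        y.append(revenue[t])
    return X, y


def solve_least_squares(X: List[List[float]], y: List[float]) -> List[float]:
    """Solve the normal equations (X^T X) w = X^T y by Gauss-Jordan elimination."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("singular system; need more or more varied data")
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict_next(revenue: Sequence[float],
                 sentiment: Sequence[Sequence[float]],
                 weights: Sequence[float], p: int, q: int) -> float:
    """One-step-ahead forecast from the most recent p / q days."""
    t = len(revenue)
    row = [revenue[t - i] for i in range(1, p + 1)]
    for j in range(1, q + 1):
        row.extend(sentiment[t - j])
    return sum(w * x for w, x in zip(weights, row))
```

Dropping the sentiment columns from each row reduces this sketch to a plain autoregressive model on revenue alone.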
Experiment settings for sales prediction
Experiment settings
Datasets
Blog entries on movies
From May 1, 2006 to August 8, 2006
Google blog search
Box office revenue data
Evaluation metric
Empirical study for sales prediction
Parameter selections
K, P, Q
Trend and real cases
Comparison with alternative methods
Without sentiment: autoregressive model
With volume: replace sentiment with volume scalar
Comparison with feature selection methods
Bag-of-words
Adaptive sentiment analysis
S-PLSA+: Adaptive sentiment analysis
Capture the hidden sentiment factors in the reviews
Incrementally update parameters as more data become available
Quasi-Bayesian estimation
Batch training
Incremental change
Application to sales performance prediction
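The idea of updating parameters incrementally, rather than retraining on all data, can be illustrated with a much simpler stand-in than S-PLSA+: a single word distribution with a Dirichlet prior, where the posterior after each batch serves as the prior for the next. The class name, the vocabulary, and the `prior` pseudo-count are hypothetical; this sketch has no hidden sentiment factors and is not the estimation procedure used by S-PLSA+.

```python
from typing import Dict, Iterable


class IncrementalWordModel:
    """Word distribution whose estimate is refined batch by batch."""

    def __init__(self, vocabulary: Iterable[str], prior: float = 1.0):
        # Dirichlet pseudo-counts: the posterior after one batch
        # becomes the prior for the next batch.
        self.alpha: Dict[str, float] = {w: prior for w in vocabulary}

    def update(self, batch: Iterable[str]) -> None:
        """Fold a new batch of words into the counts without revisiting old data."""
        for word in batch:
            if word in self.alpha:  # out-of-vocabulary words are ignored
                self.alpha[word] += 1.0

    def probabilities(self) -> Dict[str, float]:
        """Posterior mean of the word distribution."""
        total = sum(self.alpha.values())
        return {w: a / total for w, a in self.alpha.items()}
```

Because only the pseudo-counts are stored, each update costs time proportional to the new batch, which is the practical appeal of incremental change over batch training.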
Review quality mining
Popular solution
Aggregated score
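As a minimal sketch, assuming the aggregated score is the share of readers who voted a review helpful (the slide does not define it), the computation could look like this. The function name and the choice to return `None` for a review with no votes are hypothetical.

```python
from typing import Optional


def helpfulness_score(helpful_votes: int, total_votes: int) -> Optional[float]:
    """Fraction of voters who marked a review helpful, or None with no votes."""
    if total_votes == 0:
        return None  # a review nobody has voted on has no score yet
    return helpful_votes / total_votes
```

For example, `helpfulness_score(3, 4)` gives 0.75, while a freshly posted review with no votes gets no score at all.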
Problems of existing solutions
Few votes for new posts
Valuable reviews being buried in the large number of low-quality reviews
Monopoly
Reviews on the top receive more attention
Spam voting
Motivated by some interests
A non-linear model for mining review quality
HelpMeter: A non-linear regression model for predicting the helpfulness of online reviews
Features and contributions:
Support helpfulness prediction
Detect the review quality irrespective of publishing time
Integrate most influential factors that may affect the helpfulness value
A