TY - JOUR
T1 - Incorporating appraisal expression patterns into topic modeling for aspect and sentiment word identification
AU - Zheng, Xiaolin
AU - Lin, Zhen
AU - Wang, Xiaowei
AU - Lin, Kwei Jay
AU - Song, Meina
PY - 2014/5
Y1 - 2014/5
N2 - With the considerable growth of user-generated content, online reviews are becoming extremely valuable sources for mining customers' opinions on products and services. However, most of the traditional opinion mining methods are coarse-grained and cannot understand natural languages. Thus, aspect-based opinion mining and summarization are of great interest in academic and industrial research. In this paper, we study an approach to extract product and service aspect words, as well as sentiment words, automatically from reviews. An unsupervised dependency analysis-based approach is presented to extract Appraisal Expression Patterns (AEPs) from reviews, which represent the manner in which people express opinions regarding products or services and can be regarded as a condensed representation of the syntactic relationship between aspect and sentiment words. AEPs are high-level, domain-independent types of information, and have excellent domain adaptability. An AEP-based Latent Dirichlet Allocation (AEP-LDA) model is also proposed. This is a sentence-level, probabilistic generative model which assumes that all words in a sentence are drawn from one topic - a generally true assumption, based on our observation. The model also assumes that every review corpus is composed of several mutually corresponding aspect and sentiment topics, as well as a background word topic. The AEP information is incorporated into the AEP-LDA model for mining aspect and sentiment words simultaneously. The experimental results on reviews of restaurants, hotels, MP3 players, and cameras show that the AEP-LDA model outperforms other approaches in identifying aspect and sentiment words.
AB - With the considerable growth of user-generated content, online reviews are becoming extremely valuable sources for mining customers' opinions on products and services. However, most of the traditional opinion mining methods are coarse-grained and cannot understand natural languages. Thus, aspect-based opinion mining and summarization are of great interest in academic and industrial research. In this paper, we study an approach to extract product and service aspect words, as well as sentiment words, automatically from reviews. An unsupervised dependency analysis-based approach is presented to extract Appraisal Expression Patterns (AEPs) from reviews, which represent the manner in which people express opinions regarding products or services and can be regarded as a condensed representation of the syntactic relationship between aspect and sentiment words. AEPs are high-level, domain-independent types of information, and have excellent domain adaptability. An AEP-based Latent Dirichlet Allocation (AEP-LDA) model is also proposed. This is a sentence-level, probabilistic generative model which assumes that all words in a sentence are drawn from one topic - a generally true assumption, based on our observation. The model also assumes that every review corpus is composed of several mutually corresponding aspect and sentiment topics, as well as a background word topic. The AEP information is incorporated into the AEP-LDA model for mining aspect and sentiment words simultaneously. The experimental results on reviews of restaurants, hotels, MP3 players, and cameras show that the AEP-LDA model outperforms other approaches in identifying aspect and sentiment words.
KW - Appraisal expression pattern
KW - Aspect and sentiment analysis
KW - Dependency analysis
KW - Opinion mining
KW - Topic modeling
UR - http://www.scopus.com/inward/record.url?scp=84897411279&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2014.02.003
DO - 10.1016/j.knosys.2014.02.003
M3 - 文章
AN - SCOPUS:84897411279
SN - 0950-7051
VL - 61
SP - 29
EP - 47
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
ER -