ABSTRACT
Mining opinion targets and opinion words from online reviews are important tasks for
fine-grained opinion mining, the key component of which involves detecting
opinion relations among words. To this end, this paper proposes a novel
approach based on the partially-supervised alignment model, which regards
identifying opinion relations as an alignment process. Then, a graph-based
co-ranking algorithm is exploited to estimate the confidence of each candidate.
Finally, candidates with higher confidence are extracted as opinion targets or
opinion words. Compared to previous methods based on the nearest-neighbor
rules, our model captures opinion relations more precisely, especially for
long-span relations. Compared to syntax-based methods, our word alignment model
effectively alleviates the negative effects of parsing errors when dealing with
informal online texts. In particular, compared to the traditional unsupervised
alignment model, the proposed model obtains better precision because of the
usage of partial supervision. In addition, when estimating candidate
confidence, we penalize higher-degree vertices in our graph-based co-ranking
algorithm to decrease the probability of error generation. Our experimental
results on three corpora with different sizes and languages show that our
approach effectively outperforms state-of-the-art methods.
AIM
The main aim of this paper is to propose a novel approach based on the
partially-supervised alignment model, which regards identifying opinion
relations as an alignment process. A graph-based co-ranking algorithm is then
exploited to estimate the confidence of each candidate. Finally, candidates
with higher confidence are extracted as opinion targets or opinion words.
SCOPE
The scope of this paper covers experiments on three corpora of different sizes
and languages, whose results show that our approach outperforms
state-of-the-art methods.
EXISTING SYSTEM
Opinion target and opinion word extraction are not new tasks in opinion mining,
and significant effort has been focused on them. Existing work can be divided
into two categories according to its extraction aim: sentence-level extraction
and corpus-level extraction. In sentence-level extraction, the task is to
identify the opinion target mentions or opinion expressions in sentences. Thus,
these tasks are usually regarded as sequence-labeling problems. Intuitively,
contextual words are selected as the features to indicate opinion
targets/words in sentences. Most previous approaches adopted a collective
unsupervised extraction framework. As mentioned in our first section, detecting
opinion relations and calculating opinion associations among words is the key
component of this type of method. Some earlier work adopted the co-occurrence
frequency of opinion targets and opinion words to indicate their opinion
associations. Other work exploited nearest-neighbor rules to identify opinion
relations among words; frequent and explicit product features were then
extracted using a bootstrapping process. However, using only co-occurrence
information or nearest-neighbor rules to detect opinion relations among words
could not obtain precise results.
DISADVANTAGES
· This strategy cannot obtain precise results because there exist long-span
modified relations and diverse opinion expressions.
· In the bootstrapping process, errors extracted in one iteration are not
filtered out in subsequent iterations.
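The long-span problem in the first point can be illustrated with a toy sketch. This is not code from the paper: the function name `nearest_neighbor_pairs`, the coarse tags "NOUN" and "ADJ", and the hand-tagged sentence are all assumptions made for illustration. The rule pairs each adjective with the closest noun by token distance.

```python
def nearest_neighbor_pairs(tagged):
    """Pair each adjective (opinion word candidate) with the closest noun
    (opinion target candidate), measured by token distance.

    tagged: list of (word, tag) tuples; the tags are hypothetical coarse labels.
    """
    noun_positions = [i for i, (_, tag) in enumerate(tagged) if tag == "NOUN"]
    pairs = []
    for i, (word, tag) in enumerate(tagged):
        if tag != "ADJ" or not noun_positions:
            continue
        # Pick the noun with the smallest distance to this adjective.
        j = min(noun_positions, key=lambda n: abs(n - i))
        pairs.append((tagged[j][0], word))
    return pairs


# "The screen of this phone is bright": "bright" modifies "screen",
# but "phone" is the closer noun.
sentence = [
    ("The", "DET"), ("screen", "NOUN"), ("of", "ADP"), ("this", "DET"),
    ("phone", "NOUN"), ("is", "VERB"), ("bright", "ADJ"),
]
pairs = nearest_neighbor_pairs(sentence)
print(pairs)
```

Here the rule links "bright" to "phone" rather than to "screen", which is the kind of long-span relation a purely distance-based rule misses.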
PROPOSED SYSTEM
In this paper, we propose a method based on a monolingual word alignment model
(WAM). An opinion target can find its corresponding modifier through word
alignment. Compared to syntax-based methods, the WAM is more robust because it
does not need to parse informal texts. In addition, the WAM can integrate
several intuitive factors, such as word co-occurrence frequencies and word
positions, into a unified model for indicating the opinion relations among
words. Thus, we expect to obtain more precise results on opinion relation
identification. The alignment model is partially supervised: a portion of the
alignment links is given as constraints, and a constrained EM algorithm based
on hill-climbing is then performed to determine all of the alignments in
sentences, keeping the model as consistent with these links as possible. A
random walk based co-ranking algorithm is then proposed to estimate each
candidate’s confidence on the graph. In this process, we penalize high-degree
vertices to weaken their impact and decrease the probability of a random walk
running into unrelated regions of the graph. Meanwhile, we calculate prior
knowledge of the candidates to indicate likely noise and incorporate it into
our ranking algorithm, so that it works together with the graph-based evidence
when estimating candidate confidence.
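The idea of training an alignment model while respecting a few known links can be sketched as follows. This is a deliberately simplified stand-in, not the hill-climbing procedure described above: it uses a basic EM loop over word-to-word alignment probabilities in which the given links are held fixed. The function name `constrained_em`, the `partial_links` format, and the toy sentences are assumptions made for illustration.

```python
from collections import defaultdict


def constrained_em(sentences, partial_links, iterations=10):
    """Estimate t[(w_i, w_j)], the probability that word w_i aligns to w_j.

    sentences: list of token lists (monolingual; a word never aligns to itself).
    partial_links: {(sentence_index, i): j}, known links that are held fixed.
    """
    t = defaultdict(lambda: 1.0)  # uniform (unnormalized) starting point
    for _ in range(iterations):
        counts = defaultdict(float)
        totals = defaultdict(float)
        for s, words in enumerate(sentences):
            for i, wi in enumerate(words):
                fixed = partial_links.get((s, i))
                if fixed is not None:
                    # Supervised position: all probability mass on the given link.
                    candidates = [fixed]
                else:
                    candidates = [j for j in range(len(words)) if j != i]
                norm = sum(t[(wi, words[j])] for j in candidates)
                if norm == 0.0:
                    continue
                # E-step: distribute expected counts over candidate links.
                for j in candidates:
                    p = t[(wi, words[j])] / norm
                    counts[(wi, words[j])] += p
                    totals[words[j]] += p
        # M-step: renormalize expected counts per aligned-to word.
        t = defaultdict(float)
        for (wi, wj), c in counts.items():
            t[(wi, wj)] = c / totals[wj]
    return t


sentences = [["screen", "phone", "bright"], ["phone", "screen", "bright"]]
# Assumed supervision: in sentence 0, "bright" (position 2) aligns to "screen" (position 0).
partial_links = {(0, 2): 0}
t = constrained_em(sentences, partial_links)
print(t[("bright", "screen")], t[("bright", "phone")])
```

Without the single fixed link the two sentences treat "screen" and "phone" symmetrically; with it, the model prefers aligning "bright" to "screen" in the unconstrained sentence as well.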
ADVANTAGES
- The proposed model retains the advantages of the word alignment model for opinion relation identification and is also more precise because of the use of partial supervision.
- The confidence of each candidate is estimated in a global process with graph co-ranking. Intuitively, error propagation is effectively alleviated.
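A minimal sketch of graph co-ranking with a penalty on high-degree vertices is given below. It is not the paper's exact formulation: the update rule, the parameters `mu` (weight of the prior) and `beta` (strength of the degree penalty), the function name `co_rank`, and the toy graph are all assumptions made for illustration. Each candidate's confidence mixes its prior with evidence from its neighbors, and a neighbor's contribution is divided by that neighbor's degree raised to `beta`.

```python
from collections import defaultdict


def co_rank(edges, prior_t, prior_o, mu=0.3, beta=1.0, iterations=50):
    """Jointly rank opinion target and opinion word candidates.

    edges: {(target, opinion_word): association weight in [0, 1]}.
    prior_t, prior_o: prior confidence for each target / opinion word.
    """
    degree = defaultdict(int)  # number of neighbors of each vertex
    for target, word in edges:
        degree[("t", target)] += 1
        degree[("o", word)] += 1

    conf_t, conf_o = dict(prior_t), dict(prior_o)
    for _ in range(iterations):
        new_t = {target: 0.0 for target in prior_t}
        new_o = {word: 0.0 for word in prior_o}
        for (target, word), w in edges.items():
            # Evidence from a high-degree neighbor is damped by its degree.
            new_t[target] += w * conf_o[word] / degree[("o", word)] ** beta
            new_o[word] += w * conf_t[target] / degree[("t", target)] ** beta
        conf_t = {k: (1 - mu) * new_t[k] + mu * prior_t[k] for k in prior_t}
        conf_o = {k: (1 - mu) * new_o[k] + mu * prior_o[k] for k in prior_o}
    return conf_t, conf_o


# Toy graph: "good" is a generic, high-degree opinion word.
edges = {
    ("screen", "bright"): 1.0,
    ("screen", "good"): 1.0,
    ("battery", "good"): 1.0,
    ("thing", "good"): 1.0,
}
prior_t = {"screen": 1.0, "battery": 1.0, "thing": 1.0}
prior_o = {"bright": 1.0, "good": 1.0}
conf_t, conf_o = co_rank(edges, prior_t, prior_o)
print(sorted(conf_t.items(), key=lambda kv: -kv[1]))
```

In this toy graph "screen" ranks above "thing" because it also receives evidence from the specific word "bright", while the evidence passed on by "good" is split across its three neighbors.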
System Configuration
Hardware Requirements
- Speed - 1.1 GHz
- Processor - Pentium IV
- RAM - 512 MB (min)
- Hard Disk - 40 GB
- Keyboard - Standard Windows Keyboard
- Mouse - Two or Three Button Mouse
- Monitor - LCD/LED
Software Requirements
- Operating System : Windows 7
- Front End : ASP.Net and C#
- Database : MSSQL
- Tool : Microsoft Visual Studio
References
Kang Liu, Liheng Xu, Jun Zhao, “Co-extracting Opinion Targets and Opinion Words
from Online Reviews Based on the Word Alignment Model,” IEEE Transactions on
Knowledge and Data Engineering, Volume 27, Issue 3, July 2014.