NTU Management Review (臺大管理論叢), 2017/5, Vol. 27, No. 2S, pp. 1-28

DOI:10.6226/NTUMR.2017.JUN.F104-008

社群媒體中顧客知識之挖掘:意見探勘技術開發

Mining Consumer Knowledge from Social Media: Development of an Opinion Mining Technique

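Abstract (translated from Chinese)

The spread of information technology and the Internet has fueled the rapid growth of many emerging applications, and large and varied data are accumulating quickly. The concept of big data analytics arose to mine interesting knowledge effectively from these large volumes of data. Opinion mining is a core big data analytics technique. Its purpose is to analyze, from large volumes of user-generated content, users' subjective views (e.g., opinions, sentiments, and evaluations) on entities of interest (e.g., products and services), and to summarize this information appropriately, converting it into structured customer knowledge. This study focuses on the opinion sentence identification task in opinion mining. To reduce the substantial labor and time that traditional supervised learning requires for preparing training data, this study proposes an approach in which the user provides only a small number of keywords; supplemented with unannotated user-generated content collected from social media, these keywords enable semi-supervised learning that produces mining results similar to or even better than those of supervised learning. Specifically, this study adopts a class association rule algorithm, combined with a semi-supervised learning method designed in this study, to propose a rule-based opinion sentence identification (R-OSI) technique. According to the experimental evaluation results, the R-OSI technique achieves performance close to or even better than that of supervised methods.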

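Keywords (translated from Chinese)

opinion sentence identification, opinion mining, user-generated content, social media analytics, big data analytics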

Abstract

As information and network technology has become widespread, many interesting new applications have emerged, and the volume and variety of data are growing rapidly. These data are considered vital assets for supporting crucial business intelligence applications. To better manage and use these valuable data, big data analytics, the process of examining large datasets containing a variety of data types to uncover hidden, previously unknown, and potentially useful patterns and knowledge, has become a crucial research issue. In this study, we concentrate on an important big data analytics task, namely opinion mining. We propose a rule-based opinion sentence identification (R-OSI) technique, which retrieves review sentences relevant to a specific product feature of interest from a large volume of consumer reviews. The novelty of the proposed technique is that it adopts a semi-supervised learning approach: the user is asked to provide keywords that describe the target product feature, and a set of unannotated consumer reviews is retrieved from various social media websites. On the basis of the user-provided keywords and the unannotated consumer reviews, a class association rule mining algorithm is applied to learn a set of opinion sentence identification rules for the target product feature. Our empirical evaluation results suggest that the proposed R-OSI technique achieves promising performance in opinion sentence identification, even when a supervised learning approach is adopted as the performance benchmark.

Keywords

opinion sentence identification, opinion mining, user generated content, social media analytics, big data analytics

楊錦生 (Chin-Sheng Yang), Assistant Professor, Department of Information Management, Yuan Ze University

謝佩芸 (Pei-Yun Xie), Master, Department of Information Management, Yuan Ze University

施曉萍 (Hsiao-Ping Shih), Master, Department of Information Management, Yuan Ze University

Received 2015/6; final revision received 2016/6