Yang, C. S., Xie, P. Y., and Shih, H. P. 2017. Mining Consumer Knowledge from Social Media: Development of an Opinion Mining Technique. NTU Management Review, 27 (2S): 1-28. doi:10.6226/NTUMR.2017.JUN.F104-008   
Mining Consumer Knowledge from Social Media: Development of an Opinion Mining Technique
楊錦生 / 元智大學資訊管理學系助理教授
Chin-Sheng Yang, Assistant Professor, Department of Information Management, Yuan Ze University

謝佩芸 / 元智大學資訊管理學系碩士
Pei-Yun Xie, Master, Department of Information Management, Yuan Ze University

施曉萍 / 元智大學資訊管理學系碩士
Hsiao-Ping Shih, Master, Department of Information Management, Yuan Ze University

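The spread of information technology and the Internet has driven rapid growth in emerging applications, and large, varied volumes of data are accumulating quickly. The concept of big data analytics arose from the need to mine interesting knowledge from such data effectively. Opinion mining is a core big data analytics technique: it analyzes, from large volumes of user-generated content, users' subjective views (for example, opinions, sentiments, and evaluations) on entities of interest (for example, products and services), and summarizes this information into structured customer knowledge. This study focuses on the opinion sentence identification task within opinion mining. To reduce the substantial labor and time that traditional supervised learning requires for preparing training data, this study proposes an approach that needs only a small number of user-provided keywords, supplemented by user-generated content crawled from social media without manual annotation, to carry out semi-supervised learning and produce mining results comparable to or even better than those of supervised learning. Specifically, this study adopts a class association rule algorithm, paired with the semi-supervised learning method designed in this study, to propose a rule-based opinion sentence identification (R-OSI) technique. According to the experimental evaluation, the proposed R-OSI technique achieves performance close to or even better than that of supervised methods.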
Keywords (Chinese): 意見句子識別, 意見探勘, 使用者產生資料, 社群媒體分析, 巨量資料分析

With the popularization of information and network technology, many emerging applications have developed rapidly, and the volume and variety of data are growing quickly. These data are considered vital assets for supporting crucial business intelligence applications. To better manage and use such data, big data analytics, which is the process of examining large datasets containing a variety of data types to uncover hidden, previously unknown, and potentially useful patterns and knowledge, has become a crucial research issue. In this study, we concentrate on an important big data analytics task, namely opinion mining. We propose a rule-based opinion sentence identification (R-OSI) technique, which retrieves the review sentences relevant to a specific product feature of interest from a large volume of consumer reviews. The novelty of the proposed technique is that it adopts a semi-supervised learning approach in which a user provides keywords that describe the target product feature. In addition, a set of unannotated consumer reviews is retrieved from various social media websites. On the basis of the user-provided keywords and the set of unannotated consumer reviews, a class association rule mining algorithm is applied to learn a set of opinion sentence identification rules for the target product feature. Our empirical evaluation results suggest that the proposed R-OSI technique achieves promising performance in opinion sentence identification, even when a supervised learning approach is adopted as the performance benchmark.
Keywords (English): opinion sentence identification, opinion mining, user generated content, social media analytics, big data analytics
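
The abstract describes learning identification rules from user-provided keywords plus unannotated reviews by means of class association rule mining. The Python sketch below illustrates that general idea only; it is not the authors' R-OSI implementation, whose details the abstract does not give. Several points are assumptions made for illustration: sentences are pseudo-labeled by simple keyword match, the seed keywords are excluded from rule antecedents so that the learned rules generalize beyond them, support is counted as the number of pseudo-positive sentences containing the itemset, and the function names (`pseudo_label`, `mine_rules`, `is_relevant`), thresholds, and toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pseudo_label(sentences, seeds):
    """Assumption: a sentence counts as positive if it mentions any seed keyword."""
    return [bool(set(tokens) & seeds) for tokens in sentences]


def mine_rules(sentences, seeds, min_support=2, min_confidence=0.6, max_len=2):
    """Mine rules of the form itemset -> 'target feature'.

    Support is the number of pseudo-positive sentences containing the itemset;
    confidence is that count divided by all sentences containing the itemset.
    """
    labels = pseudo_label(sentences, seeds)
    total, positive = Counter(), Counter()
    for tokens, label in zip(sentences, labels):
        # Seeds are dropped from antecedents (an illustrative design choice).
        items = sorted(set(tokens) - seeds)
        for size in range(1, max_len + 1):
            for itemset in combinations(items, size):
                total[itemset] += 1
                if label:
                    positive[itemset] += 1
    rules = {}
    for itemset, pos in positive.items():
        confidence = pos / total[itemset]
        if pos >= min_support and confidence >= min_confidence:
            rules[itemset] = confidence
    return rules


def is_relevant(tokens, seeds, rules):
    """A sentence is retrieved if it has a seed keyword or satisfies any rule."""
    words = set(tokens)
    if words & seeds:
        return True
    return any(words.issuperset(itemset) for itemset in rules)


# Toy data: sentences are assumed to be tokenized with stopwords removed.
reviews = [
    ["battery", "drains", "fast", "charge"],
    ["battery", "life", "short", "charge"],
    ["charge", "lasts", "short", "time"],
    ["screen", "bright", "sharp"],
    ["screen", "sharp", "outdoors"],
]
seeds = {"battery"}
rules = mine_rules(reviews, seeds)
```

On this toy data the only rule that clears both thresholds is `charge` (confidence 2/3), so the third review is retrieved for the battery feature even though it never mentions the seed keyword, while the screen reviews are not.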