MBA Thesis: Research on Machine-Learning-Based Outlier Analysis and Online Automobile Review Mining

Updated: 2020/4/15 (published from Guangdong)

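Abstract
For e-commerce in the information-overloaded Web 2.0 setting, how to process the
large volume of online word-of-mouth (online WOM) and quickly and accurately obtain
knowledge that helps customers make satisfying purchase decisions, or helps suppliers
improve products and services, has increasingly become a focus of both industry and
academia. In response to this need, the thesis takes automobiles as its research
object, a high-value, specialized product for which no mature Internet sales model
exists yet, and uses online WOM data on the Audi Q5 as an example to address three
specific questions: 1) As a specific and relatively mature product, what are the
strengths and weaknesses of the Audi Q5? 2) Can online WOM reflect customers' real
emotions while purchasing and using the product, and if so, how can this be
characterized? 3) How can outliers be found in these WOM data, and how can their
features and formation mechanisms be analyzed to produce heterogeneous knowledge?
The logic of the thesis is as follows. First, the raw data pass through an NLP
process, and all product attributes are ranked by the tendency of customers'
collective opinion. This clarifies the product's strengths and weaknesses and also
produces the basic data set needed for further study. In addition, given the current
academic debate over the usefulness of online WOM, the thesis stresses that online
WOM is valuable only if it reflects customers' real emotions while purchasing and
using the product. Accordingly, the thesis uses the basic data set produced when
answering the first question, which has the form {opinion holder, product attribute,
clause, polarity}, to design two indicators: the individual negative-review degree
and the modified individual negative-review degree. Multiple linear regression is
then used to find a method by which online reviews characterize customers' real
emotions. This shows that the product strengths and weaknesses derived from online
review data are reliable, and it also prepares for constructing effective coordinate
spaces and discovering heterogeneous knowledge later in the thesis. Finally, from an
outlier perspective, the thesis performs exception mining on the WOM data in two
coordinate spaces, "individual-collective opinion" and "comment-evaluation", and
attempts to sort out the features of the outlier samples in order to obtain
heterogeneous knowledge. The main practical contribution of the thesis is that it
clarifies the Audi Q5's strengths and weaknesses along different dimensions, giving
customers a useful reference for purchase decisions and suppliers a useful reference
for product improvement. In addition, from the outlier perspective the study finds
that the dissatisfaction expressed in outlying WOM is highly positively correlated
with vehicle fuel consumption, which gives practitioners a possible way to explore
new patterns of customer behavior. The main theoretical contributions are as
follows. From a management perspective, the thesis builds on a general knowledge
discovery framework and combines mature outlier detection with knowledge
classification to propose a heterogeneous knowledge discovery process for online
WOM. From a technical perspective, the thesis designs a method in which the
collective negative-review degree of each product attribute serves as a weight,
replacing the modifier lexicon in the sentiment computation for individual reviews.
This method reduces the subjectivity of sentiment computation while characterizing
customers' real emotions significantly and stably.
Keywords: online word-of-mouth; natural language processing; outlier; heterogeneous knowledge discovery
Research type: applied research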
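
The abstract names the two indicators but does not give their formulas, so the following is only a minimal sketch under assumed definitions. It assumes that the collective negative-review degree of an attribute is the share of negative clauses about that attribute across all opinion holders, that the individual negative-review degree is the share of negative clauses in one holder's reviews, and that the modified version weights each negative clause by the collective degree of its attribute. The records, function names, and polarity encoding (-1 negative, +1 positive) are hypothetical.

```python
from collections import defaultdict

# Hypothetical records in the {opinion holder, product attribute, clause, polarity}
# form described in the abstract. All values are invented for illustration.
RECORDS = [
    ("u1", "fuel consumption", "burns too much fuel", -1),
    ("u1", "appearance", "looks great", +1),
    ("u2", "fuel consumption", "fuel cost is high", -1),
    ("u2", "space", "rear seat is roomy", +1),
    ("u3", "fuel consumption", "mileage is acceptable", +1),
    ("u3", "appearance", "front grille is ugly", -1),
]


def collective_negative_degree(records):
    """Assumed definition: share of negative clauses per attribute, over all holders."""
    neg = defaultdict(int)
    total = defaultdict(int)
    for _holder, attr, _clause, polarity in records:
        total[attr] += 1
        if polarity < 0:
            neg[attr] += 1
    return {attr: neg[attr] / total[attr] for attr in total}


def individual_negative_degree(records, weights=None):
    """Assumed definition: per-holder share of negative clauses.

    With `weights` (attribute -> collective negative degree), each negative
    clause counts by its attribute's weight instead of 1, which gives the
    assumed "modified" indicator.
    """
    weighted_neg = defaultdict(float)
    count = defaultdict(int)
    for holder, attr, _clause, polarity in records:
        count[holder] += 1
        if polarity < 0:
            weighted_neg[holder] += 1.0 if weights is None else weights[attr]
    return {holder: weighted_neg[holder] / count[holder] for holder in count}


collective = collective_negative_degree(RECORDS)
# Attributes ordered from most to least criticized by the collective.
ranking = sorted(collective, key=collective.get, reverse=True)
plain = individual_negative_degree(RECORDS)
modified = individual_negative_degree(RECORDS, weights=collective)

print(ranking)
print(plain)
print(modified)
```

In this toy data every holder has the same unweighted score, while the weighted score separates holders who complain about a widely criticized attribute from the holder who complains about a less criticized one. The weights come from the data itself rather than from a hand-built modifier lexicon, which is the design idea the abstract describes.
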
Abstract
For e-commerce in the context of Web 2.0 and information overload, how to deal with
the large volume of online WOM and rapidly and accurately obtain knowledge that
helps customers make satisfying purchase decisions, or helps suppliers improve
products and services, has increasingly become a focus of industry and academia.
In response to this demand, the paper takes automobiles, a high-value and
specialized product with an immature Internet sales model, as the research object
and uses online WOM data on the Audi Q5 as an example to solve three problems:
1) As a specific and mature product, what are the advantages and disadvantages of
the Audi Q5? 2) Can online WOM reflect the real emotion of customers when
purchasing and using a product, and if so, how can it be characterized? 3) How can
outliers be found in online WOM, and how can the features and formation mechanism
of these points be analyzed to generate heterogeneous knowledge? The logic of the
paper is as follows. First, after NLP processing, all product attributes were
ordered according to customers' collective opinions, which clarified the advantages
and disadvantages of the product and produced the basic data set necessary for
further study. In addition, in view of the dispute in academia about the usefulness
of online WOM, the paper emphasized that online WOM is valuable only when it
reflects the real emotion of customers when purchasing and using a product.
Accordingly, on the basis of the basic data set formed while answering the first
question, {opinion holder, product attribute, clause, polarity}, two indicators were
designed, namely individual negative review and modified individual negative
review, and multiple linear regression was used to find a method by which online
reviews characterize the real emotion of customers. This showed that the advantages
and disadvantages of the product derived from online review data are reliable.
It also laid a foundation for constructing effective coordinate spaces and then
finding heterogeneous knowledge. Finally, from the outlier perspective, exception
mining was performed on the WOM data in two coordinate spaces, namely
individual-collective opinion and comment-evaluation, so as to sort out the
features of outlier samples and obtain heterogeneous knowledge. The main practical