Journal: Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis

Customers’ Opinion Mining from Extensive Amount of Textual Reviews in Relation to Induced Knowledge Growth



Abstract

Customers of various services are often invited to type a summarizing review via an Internet portal. Such reviews, written in natural language, are typically unstructured and are accompanied by a numeric rating on a scale from “bad” to “good.” The more reviews are collected, the better the feedback available for improving the service. However, once massive amounts of data have accumulated, the non-linearly growing processing complexity may exceed the computational capacity available for analyzing the text. Decision tree inducers such as c5 can reveal understandable knowledge from data, but they need the data as a whole. This article describes an application of windowing, a technique for generating dataset subsamples that provide enough information for an inducer to train a classifier and obtain results similar to those achieved by training a model on the entire dataset. The results of windowing, which significantly reduces the complexity of the learning problem, are demonstrated on hundreds of thousands of reviews written in English by hotel-service customers. A user obtains knowledge represented by significant words. The results report classification errors, training and testing times, tree sizes, and the words relevant to a review's meaning, as functions of the training subsample size. Finally, a method for estimating a suitable training-set size is suggested.
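The windowing idea described in the abstract can be illustrated with a minimal sketch. It assumes the classical scheme in which a tree is trained on a small random window and the examples it misclassifies are moved into the window before retraining; the article's actual procedure may differ. The tiny Gini-based tree on word-presence features is only a stand-in for c5, and the eight hotel reviews, the function names, and the parameters are all hypothetical, invented for illustration.

```python
import random
from collections import Counter

def tokenize(text):
    """Represent a review as the set of lowercase words it contains."""
    return set(text.lower().split())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, depth=0, max_depth=10):
    """Toy inducer (not c5): a leaf is a label, a node is (word, yes, no)."""
    labels = [y for _, y in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    vocab = set().union(*(w for w, _ in rows))
    best = None
    for word in sorted(vocab):  # sorted for deterministic tie-breaking
        yes = [y for w, y in rows if word in w]
        no = [y for w, y in rows if word not in w]
        if not yes or not no:
            continue  # this word does not split the node
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
        if best is None or score < best[0]:
            best = (score, word)
    if best is None:
        return majority
    word = best[1]
    yes_rows = [r for r in rows if word in r[0]]
    no_rows = [r for r in rows if word not in r[0]]
    return (word,
            build_tree(yes_rows, depth + 1, max_depth),
            build_tree(no_rows, depth + 1, max_depth))

def predict(tree, words):
    """Walk from the root to a leaf, testing word presence at each node."""
    while isinstance(tree, tuple):
        word, yes, no = tree
        tree = yes if word in words else no
    return tree

def windowing(rows, initial_size, max_rounds=10, seed=0):
    """Train on a window; move misclassified examples into it and retrain."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    window, rest = indices[:initial_size], indices[initial_size:]
    tree = build_tree([rows[i] for i in window])
    for _ in range(max_rounds):
        wrong = [i for i in rest if predict(tree, rows[i][0]) != rows[i][1]]
        if not wrong:
            break  # the current window already explains the remaining data
        window.extend(wrong)
        wrong_set = set(wrong)
        rest = [i for i in rest if i not in wrong_set]
        tree = build_tree([rows[i] for i in window])
    return tree, window

# Hypothetical toy reviews, made up for this sketch (not from the article).
reviews = [
    ("clean room friendly staff", "good"),
    ("clean quiet room", "good"),
    ("friendly staff great breakfast", "good"),
    ("great location clean bathroom", "good"),
    ("dirty room rude staff", "bad"),
    ("noisy room dirty bathroom", "bad"),
    ("rude staff cold breakfast", "bad"),
    ("noisy street cold room", "bad"),
]
rows = [(tokenize(text), label) for text, label in reviews]
tree, window = windowing(rows, initial_size=2)
print("window size:", len(window), "of", len(rows))
print("tree:", tree)
```

The words tested at the tree's internal nodes play the role of the "significant words" mentioned in the abstract. On a dataset this small the window can grow to cover most of the examples; the savings claimed for windowing only become meaningful when the window stays far smaller than the full collection.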
