
Textual data mining for industrial knowledge management and text classification: A business oriented approach



Abstract

Textual databases are useful sources of information and knowledge, and if they are well utilised, issues related to future project management and product or service quality improvement may be resolved. A large part of corporate information, approximately 80%, is available in textual data formats. Text classification techniques are well known for managing on-line sources of digital documents. Identifying the key issues discussed within textual data and classifying them into two different classes could help decision makers or knowledge workers to manage their future activities better. This research is relevant for most text-based documents and is demonstrated on Post Project Reviews (PPRs), which are a valuable source of information and knowledge. The application of textual data mining techniques for discovering useful knowledge and classifying textual data into different classes is a relatively new area of research. The research work presented in this paper focuses on hybrid applications of text mining (textual data mining) techniques to classify textual data into two different classes. The research applies clustering techniques at the first stage and Apriori Association Rule Mining at the second stage. Apriori Association Rule Mining is applied to generate Multiple Key Term Phrasal Knowledge Sequences (MKTPKS), which are later used for classification. Additionally, studies were made to improve the classification accuracies of the classifiers, namely C4.5, K-NN, Naive Bayes and Support Vector Machines (SVMs). The classification accuracies were measured and the results compared with those of a single-term-based classification model. The proposed methodology could be used to analyse any free formatted textual data; in the current research it is demonstrated on an industrial dataset consisting of PPRs collected from the construction industry. The data or information available in these reviews is codified in multiple different formats, but in the current research scenario only free formatted text documents are examined. Experiments showed that the performance of the classifiers improved when the proposed methodology was adopted.
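The second stage described in the abstract, mining frequent multi-term sets with Apriori and then using them as classification features, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the `min_support` value, and the helper names `apriori` and `encode` are assumptions made for illustration, the clustering stage is assumed to have already grouped the documents, and the unordered term sets below only approximate the MKTPKS the paper describes.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(terms): support} for every term set meeting min_support."""
    n = len(transactions)
    # Level 1 candidates: every single term seen in any document.
    candidates = {frozenset([t]) for doc in transactions for t in doc}
    frequent = {}
    k = 1
    while candidates:
        # Support = fraction of documents containing the whole candidate set.
        counts = {c: sum(1 for doc in transactions if c <= doc) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-sets into (k+1)-set candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def encode(doc, term_sets):
    """Binary feature vector: 1 where the document contains the whole term set."""
    return [1 if ts <= doc else 0 for ts in term_sets]


# Hypothetical key terms from one cluster of review documents (illustrative only).
cluster = [
    {"cost", "overrun", "client"},
    {"cost", "overrun", "delay"},
    {"cost", "overrun", "client", "delay"},
    {"safety", "site"},
]

frequent = apriori(cluster, min_support=0.5)

# Keep only multi-term sets, in a stable order, as candidate features.
multi_term = sorted(
    (ts for ts in frequent if len(ts) >= 2),
    key=lambda s: (len(s), sorted(s)),
)

for ts in multi_term:
    print(sorted(ts), frequent[ts])
```

Each document can then be turned into a binary vector with `encode(doc, multi_term)` and passed to any standard classifier; the abstract compares this kind of multi-term representation against a single-term baseline.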

Bibliographic details

  • Source
    Expert Systems with Applications | 2012, Issue 5 | pp. 4729-4739 | 11 pages
  • Authors

    N. Ur-Rahman; J.A. Harding;

  • Affiliations

    Wolfson School of Mechanical and Manufacturing Engineering, Loughborough University, Loughborough, Leicestershire LE11 3TU, UK;

    Wolfson School of Mechanical and Manufacturing Engineering, Loughborough University, Loughborough, Leicestershire LE11 3TU, UK;

  • Original format: PDF
  • Text language: English
  • Keywords

    textual data mining; text mining; post project reviews;

