Journal: Business & Information Systems Engineering

Intelligent User Assistance for Automated Data Mining Method Selection


Abstract

In any data science and analytics project, mapping a domain-specific problem to an adequate set of data mining methods is a crucial step that is usually performed by experts in the field. However, these experts are not always available, and data mining novices may be required to perform the task. While several research efforts address automated method selection as a means of support, only a few approaches consider the particularities of problems expressed in the natural, domain-specific language of the novice. The study proposes the design of an intelligent assistance system that takes problem descriptions articulated in natural language as input and offers advice on the most suitable class of data mining methods. Following a design science research approach, the paper (i) outlines the problem setting with an exemplary scenario from industrial practice, (ii) derives design requirements, (iii) develops design principles and proposes design features, (iv) develops and implements the IT artifact using several methods such as embeddings, keyword extraction, topic models, and text classifiers, (v) demonstrates and evaluates the implemented prototype based on different classification pipelines, and (vi) discusses the practical and theoretical contributions of the results. The best-performing classification pipelines show high accuracies when applied to validation data and are capable of creating a suitable mapping that exceeds the performance of joint novice assessments and of simpler means of text mining. The research provides a promising foundation for further enhancements, either as a stand-alone intelligent assistance system or as an add-on to existing data science and analytics platforms.
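
To make the core idea concrete, the following minimal sketch maps a natural-language problem description to a class of data mining methods with a text classifier. It is an illustration under stated assumptions, not the system described in the abstract: the training sentences, the three class names, and the choice of a multinomial Naive Bayes classifier with add-one smoothing are all hypothetical, and the abstract does not say which classifiers or data the authors used.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy training data: (problem description, method class).
# Sentences and class names are illustrative assumptions, not the paper's data.
TRAIN = [
    ("predict whether a machine will fail next week", "classification"),
    ("decide if a customer will cancel the contract or stay", "classification"),
    ("estimate the expected energy consumption per hour", "regression"),
    ("forecast how much revenue each store will make", "regression"),
    ("group customers with similar buying behaviour", "clustering"),
    ("find segments of similar sensor profiles without labels", "clustering"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples):
        self.class_counts = Counter()            # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in samples:
            self.class_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def scores(self, text):
        """Return the unnormalized log-probability of each class."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        result = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            log_p = math.log(n_docs / total_docs)  # class prior
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][token]
                log_p += math.log((count + 1) / (total_words + vocab_size))
            result[label] = log_p
        return result

    def predict(self, text):
        """Return the class with the highest score."""
        scores = self.scores(text)
        return max(scores, key=scores.get)


clf = NaiveBayesTextClassifier().fit(TRAIN)
print(clf.predict("group similar customers"))
print(clf.predict("will the machine fail"))
```

A bag-of-words classifier like this one corresponds more closely to the "simpler means of text mining" that the abstract uses as a baseline. The pipelines it reports also draw on embeddings, keyword extraction, and topic models, which this sketch leaves out.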
