Conference proceedings: Machine learning and data mining in pattern recognition

Pattern Mining with Natural Language Processing: An Exploratory Approach


Abstract

Pattern mining derives from the need to discover hidden knowledge in very large amounts of data, regardless of the form in which it is presented. Natural Language Processing (NLP), in turn, arose from the human need to be understood by computers. In this paper we present an exploratory approach that aims to bring together the best of both worlds. Our goal is to discover patterns in linguistically processed texts by using state-of-the-art NLP tools and traditional pattern mining algorithms.

Articles from a Portuguese newspaper are the input to a series of tests described in this paper. First, they are processed by an NLP chain, which performs a deep linguistic analysis of the text; afterwards, the pattern mining algorithms Apriori and GenPrefixSpan are applied. The results showed the applicability of sequential pattern mining techniques to structured textual data, and also provided several pieces of evidence about the structure of the language.

Bibliographic details

  • Source
  • Conference location: Leipzig (DE)
  • Author affiliations

    Spoken Language Systems Laboratory - L²F/INESC-ID, Instituto Superior Tecnico, Technical University of Lisbon, R. Alves Redol, 9 - 2°, 1000-029 Lisboa, Portugal;

    Department of Computer Science and Engineering, Instituto Superior Tecnico, Technical University of Lisbon, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal

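  • Conference organizer
  • Original format: PDF
  • Text language: English (eng)
  • Chinese Library Classification (CLC): Computer applications
  • Keywords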

