International Conference on Artificial Intelligence, ICAI 2011

Developing a Concept Extraction System for Turkish


Abstract

In recent years, the vast amount of available electronic media and data has increased the need to analyze electronic documents automatically. To assess whether a document contains valuable information, its concepts, key phrases, or main idea must be known. Some studies address extracting key phrases or main ideas from Turkish documents. However, to the best of our knowledge, there is no concept extraction system for Turkish, although such systems exist for well-known languages. In this paper, a concept extraction system is proposed for Turkish. By applying statistical and Natural Language Processing methods, documents are identified by their concepts. The system generates concepts with 51% success, but it generates more concepts than it should. Because concepts are abstract entities, that is, they need not appear verbatim in the text, assigning concepts is a very difficult task. Moreover, given the complexity of the Turkish language, this result can be considered quite satisfactory.
