Conference: International Conference on Computer Science and IT Applications
Concept-Based Compound Keyword Extraction Based on Using Sentential Distance, Conceptual Distance and Production Rules: Calculation of the Keyword Importance

Abstract

Humans can read a document and conceptually organize its contents into a few compound keywords that capture the essence of its topic. Building on this observation, this study proposes a method for extracting keywords that give the gist of a document. It uses a set of academic papers as test data to set up concept-based production rules for forming compound keywords, even when the author-provided keywords do not appear in the body text of the document. It also proposes a method for calculating the importance of a keyword so that meaningless keywords are not extracted. The validity of the extracted keywords was tested on a data set of thesis paper titles and summaries in the fields of natural language processing and speech recognition. Comparing the author-provided keywords with the keywords produced by the developed system showed that the system is useful, with an accuracy rate of up to 96%.
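
The abstract does not say how the accuracy rate is computed. As a minimal sketch of one plausible reading, the snippet below measures the fraction of author-provided keywords that also appear among the extracted keywords after simple normalization. The function names, the normalization step, and the sample data are all assumptions made for illustration and are not taken from the paper.

```python
def normalize(keyword: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(keyword.lower().split())


def keyword_accuracy(author_keywords, extracted_keywords) -> float:
    """Fraction of author-provided keywords also found among the extracted ones.

    This is an assumed metric; the paper may define accuracy differently.
    """
    author = {normalize(k) for k in author_keywords}
    extracted = {normalize(k) for k in extracted_keywords}
    if not author:
        return 0.0  # no reference keywords, so nothing can be matched
    return len(author & extracted) / len(author)


# Hypothetical example data, not taken from the paper.
author = ["Keyword Extraction", "speech recognition", "production rule"]
extracted = ["keyword  extraction", "Speech Recognition", "conceptual distance"]
score = keyword_accuracy(author, extracted)
print(f"accuracy: {score:.2f}")
```

Exact matching after normalization is the strictest simple choice. A concept-based system like the one described might instead count conceptually equivalent keywords as matches, which this sketch does not attempt.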
