International Workshop on Biologically Inspired Approaches to Advanced Information Technology (BioADIT 2004)

Naive Algorithms for Keyphrase Extraction and Text Summarization from a Single Document Inspired by the Protein Biosynthesis Process

Abstract

Keywords are a simple way of describing a document, giving the reader some clues about its contents. However, sometimes they only assign the text to a topic, in which case a summary is more useful. Keywords and abstracts are common in scientific and technical literature, but most available documents (e.g., web pages) lack such aids, so automatic keyword extraction and summarization tools are fundamental to fighting "information overload" and improving the user experience. This paper therefore describes a new technique for obtaining keyphrases and summaries from a single document. With this technique, inspired by the process of protein biosynthesis, a sort of "document DNA" can be extracted and translated into a "significance protein", which both produces a set of keyphrases and acts on the document, highlighting the most relevant passages. These ideas have been implemented in a prototype, publicly available on the Web, which has obtained promising results.
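
The abstract does not spell out how the "document DNA" or the "significance protein" is computed, so the following is not the paper's algorithm. It is only a minimal sketch of the general task, using plain word-frequency scoring as a stand-in: pick the most frequent terms of a single document as keywords, then highlight the sentences that contain the most of them. The function names, the stop-word list, and the scoring rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "in", "it", "for", "on"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stop words (a stand-in for keyphrases)."""
    return [word for word, _ in Counter(tokenize(text)).most_common(top_n)]


def highlight_sentences(text, keywords, top_n=1):
    """Return the top_n sentences with the most keyword occurrences, in document order."""
    # Naive sentence splitting on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    keyset = set(keywords)
    # Score each sentence by how many of its tokens are keywords.
    scored = [(sum(1 for w in tokenize(s) if w in keyset), i)
              for i, s in enumerate(sentences)]
    # Highest score first; ties are broken by earlier position in the document.
    best = sorted(scored, key=lambda p: (-p[0], p[1]))[:top_n]
    return [sentences[i] for _, i in sorted(best, key=lambda p: p[1])]


if __name__ == "__main__":
    doc = "Cats sleep. Cats chase cats daily. Dogs bark."
    keywords = extract_keywords(doc, top_n=1)
    print(keywords)
    print(highlight_sentences(doc, keywords, top_n=1))
```

A frequency count is about the simplest possible relevance signal; it ignores multi-word phrases and word order, both of which a real keyphrase extractor would need to handle.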