
Automatic documents summarization using ontology based methodologies

Abstract

When humans summarize a document, they usually read the text first, understand it, and then attempt to write a summary. In essence, these processes require at least some basic background knowledge on the reader's part, the least of which is the natural language the text is written in. In this thesis, an attempt is made to bridge the gap in machine understanding by proposing a framework backed by knowledge repositories constructed by humans and containing real human concepts. I use WordNet, a hierarchically structured repository that was created by linguistic experts and is rich in explicitly defined lexical relations. With WordNet, algorithms for computing the semantic similarity between terms were proposed and implemented. These algorithms proved especially useful when applied to automatic document summarization, as shown by the evaluation results obtained.

I also use Wikipedia, the largest encyclopedia to date. Because of its openness and structure, three problems had to be handled in this thesis: extracting knowledge and features from Wikipedia, enriching the representation of text documents with the extracted features, and using them in automatic summarization. When the feature extractor was applied to a summarization system, competitive evaluation results were obtained.
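
The abstract does not say which similarity measures the thesis proposes. As a rough illustration of how a hierarchy such as WordNet can yield a similarity score between two terms, the sketch below applies the well-known Wu-Palmer measure to a tiny hand-made is-a hierarchy. The hierarchy, the `PARENT` table, and the function names are all hypothetical and are not taken from the thesis or from WordNet.

```python
# Toy is-a hierarchy (child -> parent). Hypothetical data, not real WordNet content.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_to_root(term):
    """Return the chain [term, parent, ..., root] in the toy hierarchy."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def depth(term):
    """Depth of a term, counting the root as depth 1."""
    return len(path_to_root(term))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b)).

    The lowest common subsumer (lcs) is the deepest node that is an
    ancestor of both terms (a term counts as its own ancestor).
    """
    ancestors_b = set(path_to_root(b))
    # Walking up from a, the first node shared with b's ancestors is the lcs.
    lcs = next(node for node in path_to_root(a) if node in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "car"), ("dog", "dog")]:
        print(pair, round(wu_palmer(*pair), 3))
```

In this toy hierarchy, terms that share a deep common ancestor ("dog" and "cat" under "animal") score higher than terms that meet only at the root ("dog" and "car"). A real system would work over WordNet synsets and would have to handle multiple senses per word, which this sketch ignores.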

Bibliographic record

  • Author

    Bawakid Abdullah

  • Author affiliation
  • Year: 2011
  • Total pages
  • Original format: PDF
  • Language: English
  • CLC classification
