Keyword Extraction and Headline Generation Using Novel Word Features


Abstract

We introduce several novel word features for keyword extraction and headline generation. These features are derived from the background knowledge of a document as supplied by Wikipedia. To acquire a document's background knowledge from Wikipedia, we first generate a query for searching the Wikipedia corpus based on the key facts present in the document. We then use the query to find articles in the Wikipedia corpus that are closely related to the contents of the document. From the resulting set of articles, we extract the inlink, outlink, category, and infobox information in each article to derive a set of novel word features that reflect the document's background knowledge. These features offer valuable indications of the importance of individual words in the input document, and they complement the traditional word features derivable from the explicit information in a document. In addition, we introduce a word-document fitness feature to characterize the influence of a document's genre on the keyword extraction and headline generation process. We study the effectiveness of these novel word features for keyword extraction and headline generation through experiments and obtain very encouraging results.
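
The abstract does not define the individual features, so the following is only a minimal sketch of the pipeline it outlines (query generation, retrieval, then features from inlinks, outlinks, categories, and infoboxes). Everything in it is an assumption made for illustration: the `Article` class, the in-memory corpus standing in for Wikipedia, the frequency-based query, the term-overlap ranking, and the count-based feature definitions are hypothetical and are not the authors' method.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical stand-in for one retrieved Wikipedia article."""
    title: str
    text: str
    inlinks: list = field(default_factory=list)     # titles of pages linking here
    outlinks: list = field(default_factory=list)    # titles this page links to
    categories: list = field(default_factory=list)
    infobox: dict = field(default_factory=dict)     # field name -> string value


def tokenize(s):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", s.lower())


def build_query(document, stopwords, size=5):
    """Toy query (assumption): the most frequent non-stopword tokens."""
    counts = Counter(t for t in tokenize(document) if t not in stopwords)
    return [w for w, _ in counts.most_common(size)]


def search(query, corpus, top_k=2):
    """Rank articles by how many query terms occur in their title or text."""
    def score(a):
        tokens = set(tokenize(a.title + " " + a.text))
        return sum(1 for q in query if q in tokens)
    ranked = sorted(corpus, key=score, reverse=True)
    return [a for a in ranked[:top_k] if score(a) > 0]


def background_features(word, articles):
    """Per field type, count the retrieved articles whose field mentions `word`."""
    def mentions(strings):
        return any(word in tokenize(s) for s in strings)
    return {
        "inlink": sum(mentions(a.inlinks) for a in articles),
        "outlink": sum(mentions(a.outlinks) for a in articles),
        "category": sum(mentions(a.categories) for a in articles),
        "infobox": sum(mentions(list(a.infobox.values())) for a in articles),
    }


# Tiny made-up corpus and document, for illustration only.
corpus = [
    Article("Mars rover", "A rover explores the surface of Mars.",
            inlinks=["NASA", "Mars exploration"],
            outlinks=["Mars", "Robot"],
            categories=["Mars rovers", "NASA probes"],
            infobox={"operator": "NASA", "target": "Mars"}),
    Article("Banana", "A banana is an edible fruit.", categories=["Fruit"]),
]
doc = "NASA lands a new rover on Mars. The rover will study Mars rocks."
stop = {"a", "the", "on", "will", "new"}

query = build_query(doc, stop)
hits = search(query, corpus)
for w in ("mars", "nasa", "rocks"):
    print(w, background_features(w, hits))
```

In this toy setting a word such as "mars", which recurs across the link, category, and infobox fields of the retrieved article, receives higher background counts than "rocks", which appears only in the input document. A real system would feed such values, alongside traditional features, into whatever ranking or learning model is used.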


Bibliographic record

Source
Innovative applications of artificial intelligence conference; AAAI conference on artificial intelligence; IAAI-10; Symposium on educational advances in artificial intelligence; AAAI-10; EAAI-10 | 2011 | pp. 1461-1466 | 6 pages
Authors
Songhua Xu; Shaohui Yang; Francis C.M. Lau
Original format
PDF
Chinese Library Classification
Artificial intelligence theory

Similar literature

1. Keyword grouping and image feature extraction for two-way search between words and images [J]. Yasuhide Mori, Hironobu Takahashi, Ryuichi Oka. 電子情報通信学会技術研究報告. マルチメディア·仮想環境基礎. 2000, no. 184

2. Keyword grouping and image feature extraction for two-way search between words and images [J]. Yasuhide Mori, Hironobu Takahashi, Ryuichi Oka. 電子情報通信学会技術研究報告. 画像工学. Image Engineering. 2000, no. 180

3. Keyword grouping and image feature extraction for two-way search between words and images [J]. Yasuhide Mori, Hironobu Takahashi, Ryuichi Oka. 電子情報通信学会技術研究報告. パターン認識·メディア理解. Pattern Recognition and Media Understanding. 2000, no. 182

4. Keyword Extraction and Headline Generation Using Novel Word Features [C]. Songhua Xu, Shaohui Yang, Francis C. M. Lau. AAAI Conference on Artificial Intelligence. 2010

5. Keywords at Work: Investigating Keyword Extraction in Social Media Applications [D]. Lahiri, Shibamouli. 2018

6. The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction [O]. Elham Najafi, Amir H. Darooneh

7. Keyword Extraction using the Word Co-occurrence Network Properties that is Independent of Languages and Document Types and Its Evaluation by Prediction of Headline Words [O]. Yuki YAMAMOTO, Ryohei ORIHARA. 2009
