Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms

Abstract

Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which utilizes Wikipedia as a thesaurus for candidate selection from documents' content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of our method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods.

Bibliographic details

Source
International Conference on Knowledge Engineering and Knowledge Management | 2012 | pp. 32-41 | 10 pages
Authors
Arash Joorabchi; Abdulhussain E. Mahdi
Original format: PDF
Keywords
text mining; scientific digital libraries; subject metadata; keyphrase annotation; keyphrase indexing; Wikipedia; genetic algorithms

Similar documents

1. Automatic keyphrase annotation of scientific documents using Wikipedia and genetic algorithms [J]. Arash Joorabchi, Abdulhussain E. Mahdi. Journal of Information Science. 2013, no. 3
2. Evaluating the Impact of the Long-S upon 18th-Century Encyclopedia Britannica Automatic Subject Metadata Generation Results [J]. Sam Grabus. Information Technology and Libraries. 2020, no. 3
3. Sentence Ordering Algorithm with Subject Criterion for Automatic Multi-Document Summarization [J]. Naser Jawas, Randy Cahya Wihandika, Agus Zainal Arifin. Journal of Information and Computing Science. 2013, no. 3
4. Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms [C]. Arash Joorabchi, Abdulhussain E. Mahdi. International Conference on Knowledge Engineering and Knowledge Management. 2012
5. Data mining revision controlled document history metadata for automatic classification [D]. Maass, Dustin. 2013
6. Using XML Metadata to Enable the Automatic Generation and Processing of HTML Forms from XML Documents [O]. Anil K. Dubey, Henry C. Chueh. 2001
7. Automatic keyphrase annotation of scientific documents using Wikipedia and genetic algorithms [O]. Joorabchi, Arash; Mahdi, Abdulhussain E. 2014
