AUTOMATIC TERM EXTRACTION AND DOCUMENT SIMILARITY IN SPECIAL TEXT CORPORA


Abstract

This paper confirms that the performance of a state-of-the-art automatic term extraction method on a computer science corpus is similar to previously published performance data on a medical corpus. The extracted terms are then used to estimate the similarity of papers in the computer science corpus using the standard Vector Space Model. The precision of retrieval using a term-based representation is compared with that of a word-based representation, and a link-based similarity metric based on the overlap of the local neighborhoods of the papers in the citation graph. The term-based approach offers comparable performance to the word-based approach, but potentially with a much smaller vocabulary size.
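The two similarity measures named in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: raw feature counts stand in for whatever weighting the authors used in the Vector Space Model, and Jaccard overlap stands in for their neighborhood-overlap metric. The function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(features_a, features_b):
    """Cosine similarity between two documents given as lists of features.

    A feature can be a single word (word-based representation) or an
    extracted multi-word term (term-based representation).
    Assumption: raw counts are used as weights; the paper may use another scheme.
    """
    vec_a, vec_b = Counter(features_a), Counter(features_b)
    dot = sum(vec_a[f] * vec_b[f] for f in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def neighborhood_overlap(neighbors_a, neighbors_b):
    """Jaccard overlap of two papers' neighborhoods in the citation graph.

    Assumption: Jaccard is one plausible overlap measure, not necessarily
    the one used in the paper.
    """
    a, b = set(neighbors_a), set(neighbors_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical term-based representations of two papers.
paper_1 = ["vector space model", "term extraction"]
paper_2 = ["vector space model", "citation graph"]
term_similarity = cosine_similarity(paper_1, paper_2)

# Hypothetical citation neighborhoods (ids of citing and cited papers).
link_similarity = neighborhood_overlap({"p1", "p2", "p3"}, {"p2", "p3", "p4"})
```

Treating each extracted term as one feature is what allows the term-based vocabulary to be much smaller than the word-based one: only the extracted terms become vector dimensions, rather than every distinct word.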

Bibliographic record

Source
Pacific Association for Computational Linguistics Conference (PACLING'03); August 22-25, 2003; Halifax (CA) | 2003 | pp. 275-284 | 10 pages
Conference location: Halifax (CA)
Authors
E. Milios; Y. Zhang; B. He; L. Dong;
Author affiliation

Faculty of Computer Science, Dalhousie University, Halifax, Canada B3E 1W5;

Original format: PDF
Language of text: eng
CLC classification: Programming languages, algorithmic languages;
Keywords
natural language processing; automatic term extraction; vector space model;

Date indexed: 2022-08-26 14:15:03

Similar documents

1. Deep Text Mining for Automatic Keyphrase Extraction from Text Documents [J]. Muhammad Abulaish, Jahiruddin, Lipika Dey. Journal of Intelligent Systems. 2011, No. 4
2. Automatic Extraction Of The Fine Category Of Person Named Entities From Text Corpora [J]. Tri-Thanh NGUYEN, Akira SHIMAZU. IEICE Transactions on Information and Systems. 2007, No. 10
3. In no uncertain terms: a dataset for monolingual and multilingual automatic term extraction from comparable corpora [J]. Ayla Rigouts Terryn, Veronique Hoste, Els Lefever. Language Resources and Evaluation. 2020, No. 2
4. AUTOMATIC TERM EXTRACTION AND DOCUMENT SIMILARITY IN SPECIAL TEXT CORPORA [C]. E. Milios, Y. Zhang, B. He. Pacific Association for Computational Linguistics Conference. 2003
5. Automatic term extraction and document similarity in special text corpora [D]. Dong, Li. 2002
6. Combining Position Weight Matrices and Document-Term Matrix for Efficient Extraction of Associations of Methylated Genes and Diseases from Free Text [O]. Arwa Bin Raies, Hicham Mansour, Roberto Incitti
7. Automatic extraction of property norm-like data from large text corpora [O]. Colin Kelly, Barry Devereux, Anna Korhonen. 2013
