Extracting Clusters of Specialist Terms from Unstructured Text


Abstract

Automatically identifying related specialist terms is a difficult and important task required to understand the lexical structure of language. This paper develops a corpus-based method of extracting coherent clusters of satellite terminology (terms on the edge of the lexicon) using co-occurrence networks of unstructured text. Term clusters are identified by extracting communities in the co-occurrence graph, after which the largest is discarded and the remaining words are ranked by centrality within a community. The method is tractable on large corpora and requires no document structure and only minimal normalization. The results suggest that the model is able to extract coherent groups of satellite terms in corpora of varying size, content and structure. The findings also confirm that language consists of a densely connected core (observed in dictionaries) and systematic, semantically coherent groups of terms at the edges of the lexicon.
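
The pipeline described in the abstract (build a co-occurrence graph, extract communities, discard the largest, rank the remaining words by centrality) can be sketched roughly as follows. This is only an illustrative sketch and not the paper's implementation: the abstract does not say which community-detection algorithm or centrality measure is used, so a simple deterministic label propagation and within-community weighted degree stand in for them here. The function names and the `window` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_graph(tokens, window=3):
    """Undirected weighted graph; weight = times two words fall within `window` tokens."""
    graph = defaultdict(Counter)
    for i, left in enumerate(tokens):
        for right in tokens[i + 1:i + window]:
            if left != right:
                graph[left][right] += 1
                graph[right][left] += 1
    return graph


def label_propagation(graph, max_rounds=50):
    """Stand-in community detection (assumption): each node adopts the label
    carrying the most edge weight among its neighbours, ties resolved
    deterministically (current label first, then alphabetical)."""
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            votes = Counter()
            for neighbour, weight in graph[node].items():
                votes[labels[neighbour]] += weight
            current = labels[node]
            best = min(votes, key=lambda lab: (-votes[lab], lab != current, lab))
            if best != current:
                labels[node] = best
                changed = True
        if not changed:
            break
    communities = defaultdict(set)
    for node, label in labels.items():
        communities[label].add(node)
    return list(communities.values())


def satellite_clusters(graph):
    """Drop the largest community, then rank each remaining community's words
    by weighted degree inside that community (assumed centrality measure)."""
    communities = sorted(label_propagation(graph), key=lambda c: (-len(c), sorted(c)))
    clusters = []
    for community in communities[1:]:  # communities[0] is the largest and is discarded
        def centrality(word):
            return sum(w for other, w in graph[word].items() if other in community)
        clusters.append(sorted(community, key=lambda w: (-centrality(w), w)))
    return clusters


if __name__ == "__main__":
    # Toy token stream; a real corpus would be far larger.
    text = "the cat sat on the mat the dog sat on the rug qubit decoherence qubit entanglement"
    print(satellite_clusters(cooccurrence_graph(text.split(), window=2)))
```

On a real corpus, a library implementation of community detection and centrality would likely be a better fit than this stdlib-only stand-in; the sketch is meant only to make the order of the steps concrete.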


Bibliographic details

Source: Conference on Empirical Methods in Natural Language Processing, 2014, pp. 1426-1434 (9 pages)
Author: Aaron Gerow
Original format: PDF

Similar literature

1. Extracting Hierarchical Relationship of Scientific and Technical Terms from Unstructured Text [J]. Hongqi Han, Lijun Zhu, Zhaofeng Zhang. Journal of Information and Computational Science. 2015, No. 14.
2. A Text Mining Approach to Extract Opinions from Unstructured Text [J]. Ananthi Sheshasaayee, R. Jayanthi. Indian Journal of Science and Technology. 2015, No. 36.
3. Extracting Clusters of Specialist Terms from Unstructured Text [C]. Aaron Gerow. Conference on Empirical Methods in Natural Language Processing. 2014.
4. Information Extraction of cyber security related terms and concepts from unstructured text [D]. Lal, Ravendar. 2013.
5. Using natural language processing to extract structured epilepsy data from unstructured clinic letters: development and validation of the ExECT (extraction of epilepsy clinical text) system [O]. Beata Fonferko-Shadrach, Arron S Lacey, Angus Roberts. 2019.
6. Extracting Clusters of Specialist Terms from Unstructured Text [O]. Aaron Gerow. 2015.
