A Study on the Interplay Between the Corpus Size and Parameters of a Distributional Model for Term Classification

Abstract

We propose and evaluate a method for identifying co-hyponym lexical units in a terminological resource. The principles of term recognition and distributional semantics are combined to extract terms from a similar category of concept. Given a set of candidate terms, random projections are employed to represent them as low-dimensional vectors. These vectors are derived automatically from the frequency of the co-occurrences of the candidate terms and words that appear within windows of text in their proximity (context-windows). In a k-nearest neighbours framework, these vectors are classified using a small set of manually annotated terms which exemplify concept categories. We then investigate the interplay between the size of the corpus that is used for collecting the co-occurrences and a number of factors that play roles in the performance of the proposed method: the configuration of context-windows for collecting co-occurrences, the selection of neighbourhood size (k), and the choice of similarity metric.
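The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes random indexing (summing sparse random index vectors of context words) as one way to realise the random projection, and cosine similarity as the metric; the abstract does not state the paper's actual projection scheme, dimensionality, or window settings. All names and values here (`DIM`, `NONZERO`, `index_vector`, `term_vectors`, `knn_label`) are hypothetical and chosen only for illustration.

```python
import zlib
from collections import Counter

import numpy as np

DIM = 64      # reduced dimensionality (hypothetical value)
NONZERO = 4   # non-zero entries per index vector (hypothetical value)


def index_vector(word, dim=DIM, nonzero=NONZERO):
    """Sparse random +1/-1 vector, seeded by the word so it is reproducible."""
    rng = np.random.default_rng(zlib.crc32(word.encode("utf-8")))
    vec = np.zeros(dim)
    positions = rng.choice(dim, size=nonzero, replace=False)
    vec[positions] = rng.choice([-1.0, 1.0], size=nonzero)
    return vec


def term_vectors(tokens, candidates, window=2):
    """Sum the index vectors of words within `window` tokens of each candidate."""
    vectors = {term: np.zeros(DIM) for term in candidates}
    for i, tok in enumerate(tokens):
        if tok not in vectors:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[tok] += index_vector(tokens[j])
    return vectors


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def knn_label(query, labelled, k=3):
    """Majority label among the k labelled vectors most similar to `query`."""
    ranked = sorted(labelled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In use, `term_vectors` would be run over the tokenised corpus for all candidate terms, and each unlabelled vector would be passed to `knn_label` together with the vectors of the manually annotated terms. The factors the study varies correspond to the `window` argument, the value of `k`, the similarity function, and the amount of text fed to `term_vectors`.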


Bibliographic details

Source: International workshop on computational terminology, 2016, pp. 62-72 (11 pages)
Author: Behrang Q. Zadeh
Original format: PDF

