Conference: Computational Linguistics and Intelligent Text Processing

An Improved Automatic Term Recognition Method for Spanish



Abstract

The C-value/NC-value algorithm, a hybrid approach to automatic term recognition, was originally developed to extract multiword term candidates from specialised documents written in English. Here, we present three main modifications to this algorithm that affect how the obtained output is refined. The first modification aims to maximise the number of real terms in the list of candidates through a new approach to applying the stop-list. The second modification adapts the C-value formula so that single-word terms are also considered. The third modification changes how the term candidates are grouped, exploiting a lemmatised version of the input corpus. Additionally, the size of each candidate's context window is variable. We also describe the linguistic modifications needed to apply this algorithm to the recognition of term candidates in Spanish.
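To make the second modification concrete, here is a minimal Python sketch of a C-value computation in which single-word candidates receive a non-zero score. In the standard formula the length weight is log2 of the candidate length, which is zero for a single word. The sketch assumes one possible fix, adding a constant to that weight; the abstract does not state the exact adaptation used, so the `offset` parameter, the function name `c_value`, and the sample frequencies are all hypothetical. Candidates are represented as tuples of lemmas, in the spirit of the lemma-based grouping the abstract mentions.

```python
import math
from collections import defaultdict


def c_value(freq, offset=1.0):
    """Score term candidates with a C-value variant.

    freq   -- dict mapping a candidate (tuple of lemmas) to its corpus frequency
    offset -- assumed constant added to log2(length) so that single-word
              candidates do not get a zero weight (hypothetical choice)
    """
    # For every candidate, collect the longer candidates that contain it.
    containers = defaultdict(list)
    for longer in freq:
        n = len(longer)
        seen = set()
        for size in range(1, n):
            for start in range(n - size + 1):
                sub = longer[start:start + size]
                if sub in freq and sub not in seen:
                    seen.add(sub)
                    containers[sub].append(longer)

    scores = {}
    for cand, f in freq.items():
        weight = offset + math.log2(len(cand))
        nesting = containers.get(cand, [])
        if nesting:
            # Nested candidate: subtract the mean frequency of its containers.
            f = f - sum(freq[b] for b in nesting) / len(nesting)
        scores[cand] = weight * f
    return scores


# Illustrative, made-up frequencies for three Spanish candidates.
sample = {
    ("sistema",): 8,
    ("sistema", "operativo"): 5,
    ("sistema", "operativo", "libre"): 2,
}
scores = c_value(sample)
```

With `offset=0.0` the function reduces to the usual length weight, and every single-word candidate scores zero, which is the limitation the second modification addresses.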
