Conference: International Workshop on Semantic Evaluation

TALN at SemEval-2016 Task 11: Modelling Complex Words by Contextual, Lexical and Semantic Features

Abstract

This paper presents the participation of the TALN team in the Complex Word Identification task of SemEval-2016 (Task 11). The purpose of the task was to determine whether a word in a given sentence would be judged complex by a certain target audience. To support experiments with complex word identification approaches, the task organizers provided a training set of 2,237 words, each judged as complex or not by 20 human evaluators, together with the sentence in which each word occurs. In our contribution, we modelled each word to be evaluated as a numeric vector populated with a set of lexical, semantic and contextual features that may help assess the complexity of a word. We trained a Random Forest classifier to automatically decide whether each word is complex. We submitted two runs, trained respectively on unweighted and weighted instances of complex words, where the weight of each instance is proportional to the number of evaluators who judged the word as complex. Our system was the third best performing one.
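
The feature-vector and instance-weighting ideas described above can be illustrated with a minimal Python sketch. It is not the TALN system: the specific features, the toy frequency and sense tables, the names `Instance`, `features`, `label` and `weight`, the rule that a word is labelled complex when at least one evaluator marked it so, and the weight of 1 for non-complex words are all assumptions made for illustration.

```python
from dataclasses import dataclass

N_EVALUATORS = 20  # from the abstract: each training word was judged by 20 people


@dataclass
class Instance:
    sentence: str
    target: str
    n_complex_votes: int  # evaluators (0..20) who judged the target complex


# Hypothetical lookup tables; a real system would derive these from
# large corpora and lexical resources.
TOY_FREQ = {"the": 0.05, "cat": 0.001, "sat": 0.0008, "perfunctorily": 2e-7}
TOY_SENSES = {"cat": 8, "sat": 10, "perfunctorily": 1}


def features(inst):
    """Map one (sentence, target word) pair to a numeric feature vector."""
    tokens = [t.strip(".,;:!?\"'").lower() for t in inst.sentence.split()]
    word = inst.target.lower()
    # Relative position of the target in the sentence (0.0 = first token).
    position = tokens.index(word) / max(len(tokens) - 1, 1)
    return [
        float(len(word)),                        # lexical: word length
        float(sum(c in "aeiou" for c in word)),  # lexical: vowel count
        TOY_FREQ.get(word, 0.0),                 # lexical: toy corpus frequency
        float(TOY_SENSES.get(word, 0)),          # semantic: toy sense count
        position,                                # contextual: position
        float(len(tokens)),                      # contextual: sentence length
    ]


def label(inst):
    """Assumed labelling rule: complex if any evaluator said so."""
    return 1 if inst.n_complex_votes > 0 else 0


def weight(inst, weighted):
    """Instance weight for the unweighted and weighted training runs.

    In the weighted run, complex instances get a weight proportional to the
    number of evaluators who judged them complex; giving non-complex
    instances a weight of 1 is an assumption of this sketch.
    """
    if not weighted or inst.n_complex_votes == 0:
        return 1.0
    return float(inst.n_complex_votes)


example = Instance("The cat sat perfunctorily.", "perfunctorily", 5)
vector = features(example)
```

Vectors, labels and weights built this way could then be passed to any random forest implementation that supports per-instance weights; for example, the `fit` method of scikit-learn's `RandomForestClassifier` accepts a `sample_weight` argument. Whether the authors used that library is not stated in the abstract.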