Journal: Computational Linguistics

Identification of Multiword Expressions by Combining Multiple Linguistic Information Sources

Abstract

We propose a framework for using multiple sources of linguistic information in the task of identifying multiword expressions in natural language texts. We define various linguistically motivated classification features and introduce novel ways for computing them. We then manually define interrelationships among the features, and express them in a Bayesian network. The result is a powerful classifier that can identify multiword expressions of various types and multiple syntactic constructions in text corpora. Our methodology is unsupervised and language-independent; it requires relatively few language resources and is thus suitable for a large number of languages. We report results on English, French, and Hebrew, and demonstrate a significant improvement in identification accuracy, compared with less sophisticated baselines.
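
To make the idea of combining features in a hand-specified Bayesian network concrete, here is a minimal sketch in Python. It is not the authors' model: the two binary features (`frozen`, `high_assoc`), the network structure, and every probability below are hypothetical values chosen only for illustration. The sketch shows how a manually declared dependency between two features enters the posterior probability that a candidate is a multiword expression.

```python
from itertools import product

# Hypothetical, hand-picked probabilities for illustration only;
# none of these numbers or feature names come from the paper.
P_MWE = 0.2  # prior P(mwe = True)

# P(frozen = True | mwe): candidate shows little morphological variation.
P_FROZEN = {True: 0.7, False: 0.1}

# P(high_assoc = True | mwe, frozen): the extra parent `frozen` encodes a
# manually specified dependency between the two features.
P_HIGH_ASSOC = {
    (True, True): 0.9,
    (True, False): 0.6,
    (False, True): 0.5,
    (False, False): 0.1,
}


def bernoulli(p_true, value):
    """Probability of a binary outcome given P(outcome = True)."""
    return p_true if value else 1.0 - p_true


def joint(mwe, frozen, high_assoc):
    """Joint probability factorized along the network's edges."""
    return (
        bernoulli(P_MWE, mwe)
        * bernoulli(P_FROZEN[mwe], frozen)
        * bernoulli(P_HIGH_ASSOC[(mwe, frozen)], high_assoc)
    )


def posterior_mwe(frozen, high_assoc):
    """P(mwe = True | frozen, high_assoc) by enumerating both class values."""
    numerator = joint(True, frozen, high_assoc)
    return numerator / (numerator + joint(False, frozen, high_assoc))


if __name__ == "__main__":
    for frozen, high_assoc in product([True, False], repeat=2):
        score = posterior_mwe(frozen, high_assoc)
        print(f"frozen={frozen!s:5} high_assoc={high_assoc!s:5} P(mwe)={score:.3f}")
```

A candidate would be labeled a multiword expression when this posterior exceeds a chosen threshold. In a real system the conditional probability tables would be estimated from data rather than written by hand.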
