Conference: International Conference on Text, Speech, and Dialogue

Multiword Expressions and Idiomaticity: How Much of the Sailing Has Been Plain?

Abstract

Much progress has been made in designing accurate word representations [2-4], with resulting improvements in language technology applications such as machine translation and text simplification. Precise natural language understanding requires adequate treatment of both single words and larger units. In particular, a commonly held assumption when constructing representations for larger units such as expressions, phrases and sentences is that the meaning of the unit can be constructed from the meanings of its parts, known as the Compositionality Principle. While this principle allows an interpretation to be generated even for unseen combinations of known words, it may not be adequate for expressions such as idioms, verb-particle constructions and compound nouns, which often display idiomaticity. For instance, loan shark means a person who lends money at extremely high interest rates (rather than a fish that can be borrowed). It is therefore important to identify which words in a sentence form an expression [5], and whether an expression is idiomatic [1,6] and should be treated as a unit, as this determines whether it can be interpreted from a combination of the meanings of its component words. In this talk I discuss advances in the identification and treatment of multiword expressions in texts, focusing in particular on techniques for modelling idiomaticity.
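As an illustration only, one way to make the contrast between compositional and idiomatic expressions concrete is to compare a vector composed from the parts with a vector for the expression as a whole. The sketch below is not taken from the talk: the additive composition, the cosine comparison, the function names, the "olive oil" contrast case, and all vector values are assumptions chosen for illustration, and the toy numbers are invented rather than drawn from any trained model.

```python
import math


def add_vectors(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality_score(part_vectors, expression_vector):
    """Similarity between the sum of the part vectors and the vector
    of the whole expression. Under the additive assumption, a lower
    score suggests the expression is less compositional."""
    composed = list(part_vectors[0])
    for vec in part_vectors[1:]:
        composed = add_vectors(composed, vec)
    return cosine(composed, expression_vector)


# Hypothetical toy vectors, invented for illustration only.
toy = {
    "loan": [0.9, 0.1, 0.0],
    "shark": [0.0, 0.1, 0.9],
    "loan_shark": [0.8, 0.6, 0.0],   # expression vector far from "shark"
    "olive": [0.1, 0.9, 0.1],
    "oil": [0.2, 0.8, 0.3],
    "olive_oil": [0.15, 0.9, 0.2],   # expression vector close to its parts
}

idiomatic = compositionality_score(
    [toy["loan"], toy["shark"]], toy["loan_shark"]
)
compositional = compositionality_score(
    [toy["olive"], toy["oil"]], toy["olive_oil"]
)
print(f"loan shark: {idiomatic:.3f}")
print(f"olive oil:  {compositional:.3f}")
```

With these invented values the composed vector for "olive oil" lies closer to its expression vector than the one for "loan shark" does, which is the pattern such a score is meant to expose. Real systems would use learned representations and may use other composition functions.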