Conference: Annual Meeting of the Association for Computational Linguistics

Using Sequence Similarity Networks to Identify Partial Cognates in Multilingual Wordlists

Abstract

Increasing amounts of digital data in historical linguistics necessitate the development of automatic methods for the detection of cognate words across languages. Recently developed methods work well on language families with moderate time depths, but they cannot identify cognate morphemes in words that are only partially related. Partial cognacy, however, is a frequently recurring phenomenon, especially in language families with productive derivational morphology. This paper presents a pilot approach for partial cognate detection in which networks are used to represent similarities between word parts, and cognate morphemes are identified with the help of state-of-the-art algorithms for network partitioning. The approach is tested on a newly created benchmark dataset with data from three sub-branches of Sino-Tibetan and yields very promising results, outperforming all algorithms that are not sensitive to partial cognacy.
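The general idea of the abstract, a network over word parts that is then partitioned into groups of cognate morphemes, can be illustrated with a minimal Python sketch. This is not the paper's method: the toy wordlist and its forms are invented, the similarity score is a plain string ratio from `difflib` and not a linguistically informed alignment score, the threshold of 0.6 is arbitrary, and connected components stand in for the network partitioning algorithms the abstract mentions.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy wordlist: (language, concept, morphemes of the word).
# The forms are invented for illustration and are not taken from the paper.
WORDLIST = [
    ("LangA", "moon", ["yue", "liang"]),
    ("LangB", "moon", ["yut", "gwong"]),
    ("LangC", "moon", ["yue"]),
]


def similarity(a, b):
    """Surface string similarity in [0, 1]; a stand-in for a real
    sound-sequence alignment score."""
    return SequenceMatcher(None, a, b).ratio()


def partial_cognates(wordlist, threshold=0.6):
    """Group morphemes into putative cognate sets.

    Nodes are individual morphemes. An edge links two morphemes from
    different words when their similarity reaches the threshold. The
    network is partitioned by connected components (via union-find),
    which is an assumption made here for brevity.
    """
    nodes = [
        (w, m)
        for w, (_, _, morphemes) in enumerate(wordlist)
        for m in range(len(morphemes))
    ]
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the component root, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in combinations(nodes, 2):
        if a[0] == b[0]:
            continue  # never link two morphemes of the same word directly
        form_a = wordlist[a[0]][2][a[1]]
        form_b = wordlist[b[0]][2][b[1]]
        if similarity(form_a, form_b) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for w, m in nodes:
        label = (wordlist[w][0], wordlist[w][2][m])
        clusters.setdefault(find((w, m)), []).append(label)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    for cluster in partial_cognates(WORDLIST):
        print(cluster)
```

In this toy case the first morphemes of the three words fall into one group while the remaining morphemes stay unlinked, which is the kind of partial match that whole-word cognate detection would miss. Connected components can still merge two morphemes of the same word through a chain of links, so a real partitioning step needs to handle that case.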
