Conference: Language and Technology Conference

Binary Classification Algorithms for the Detection of Sparse Word Forms in New Indo-Aryan Languages

Abstract

This paper describes experiments in applying statistical classification algorithms to the detection of converbs, rare word forms found in historical texts in New Indo-Aryan languages. The digitized texts were first tagged manually with the help of IA Tagger, a custom-made tool that supports semi-automatic tagging. One feature of the system is that it generates statistical data on occurrences of words and phrases in various contexts, which supports historical linguistic analysis at the levels of morphosyntax, semantics, and pragmatics. The experiments, carried out on data annotated with IA Tagger, involved training multi-class and binary POS classifiers.
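The step from multi-class POS tags to a binary "converb vs. other" task can be illustrated with a minimal sketch. The abstract does not name the algorithms or the tagset used, so everything below is an assumption made for illustration: the tag name `CVB`, the suffix features, and the choice of a naive Bayes classifier are hypothetical, and the training tokens are invented nonsense strings, not New Indo-Aryan word forms.

```python
import math
from collections import Counter

# Hypothetical tag name; the actual IA Tagger tagset is not given in the abstract.
CONVERB = "CVB"


def to_binary(tagged):
    """Collapse multi-class POS tags into a binary target: converb vs. other."""
    return [(word, tag == CONVERB) for word, tag in tagged]


def features(word):
    """Toy features (an assumption): the last one, two, and three characters."""
    return [f"suf{n}={word[-n:]}" for n in (1, 2, 3) if len(word) >= n]


class BinaryNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over suffix features.

    Assumes both classes occur at least once in the training data.
    """

    def fit(self, examples):
        self.class_counts = Counter()
        self.feat_counts = {True: Counter(), False: Counter()}
        self.vocab = set()
        for word, label in examples:
            self.class_counts[label] += 1
            for f in features(word):
                self.feat_counts[label][f] += 1
                self.vocab.add(f)
        return self

    def log_score(self, word, label):
        # Log prior of the class plus smoothed log likelihood of each feature.
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total)
        denom = sum(self.feat_counts[label].values()) + len(self.vocab)
        for f in features(word):
            score += math.log((self.feat_counts[label][f] + 1) / denom)
        return score

    def predict(self, word):
        """Return True if the form is more likely a converb than not."""
        return self.log_score(word, True) > self.log_score(word, False)


# Invented placeholder tokens with an arbitrary "-ti" ending for the rare class.
tagged = [
    ("blorti", "CVB"), ("gramti", "CVB"),
    ("blorna", "V"), ("gramso", "N"), ("fleku", "N"),
    ("snavo", "ADJ"), ("trelna", "V"), ("plisu", "N"),
]

model = BinaryNaiveBayes().fit(to_binary(tagged))
print(model.predict("dwimti"), model.predict("dwimna"))
```

The class prior in `log_score` matters for a sparse target like this one: with few positive examples, the classifier leans toward "other" unless the features give clear evidence, which is one reason a dedicated binary classifier may be evaluated separately from a multi-class tagger.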