Language and Technology Conference

Binary Classification Algorithms for the Detection of Sparse Word Forms in New Indo-Aryan Languages

Abstract

This paper describes experiments in applying statistical classification algorithms to the detection of converbs, rare word forms found in historical texts in New Indo-Aryan languages. The digitized texts were first tagged by hand with the help of IA Tagger, a custom-made tool that supports semi-automatic tagging. One feature of the system is the generation of statistical data on occurrences of words and phrases in various contexts, which supports historical linguistic analysis at the levels of morphosyntax, semantics and pragmatics. The experiments, carried out on data annotated with IA Tagger, involved training multi-class and binary POS classifiers.
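
To make the idea of a binary classifier for a sparse word form concrete, here is a minimal sketch. The abstract does not name the algorithms or features used in the experiments, so everything here is an assumption: a hand-written naive Bayes model stands in for the statistical classifier, the suffix and neighbour features are hypothetical, and the romanized Hindi-like tokens are toy data invented for illustration, not taken from the annotated corpus or from IA Tagger.

```python
import math
from collections import Counter


def features(tokens, i):
    """Hypothetical features: two suffixes of the token plus its neighbours."""
    word = tokens[i]
    prev_word = tokens[i - 1] if i > 0 else "<s>"
    next_word = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return [f"suf2={word[-2:]}", f"suf3={word[-3:]}",
            f"prev={prev_word}", f"next={next_word}"]


class BinaryNaiveBayes:
    """Naive Bayes with add-one smoothing; True means 'converb'."""

    def fit(self, sentences):
        self.feat_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter()
        for tokens, labels in sentences:
            for i, label in enumerate(labels):
                self.class_counts[label] += 1
                self.feat_counts[label].update(features(tokens, i))
        self.vocab = set(self.feat_counts[True]) | set(self.feat_counts[False])
        return self

    def _log_score(self, feats, label):
        total = sum(self.feat_counts[label].values())
        prior = self.class_counts[label] / sum(self.class_counts.values())
        score = math.log(prior)
        for feat in feats:
            if feat in self.vocab:  # features never seen in training are skipped
                count = self.feat_counts[label][feat] + 1
                score += math.log(count / (total + len(self.vocab)))
        return score

    def predict(self, tokens):
        result = []
        for i in range(len(tokens)):
            feats = features(tokens, i)
            result.append(self._log_score(feats, True) > self._log_score(feats, False))
        return result


# Toy training data (invented): one converb per sentence, all other tokens negative.
TRAIN = [
    ("vah ghar jakar so gaya".split(), [False, False, True, False, False]),
    ("usne khana khakar pani piya".split(), [False, False, True, False, False]),
    ("main kitab dekhkar aya".split(), [False, False, True, False]),
]

model = BinaryNaiveBayes().fit(TRAIN)
test_tokens = "vah pani pikar aya".split()
predictions = model.predict(test_tokens)
print(list(zip(test_tokens, predictions)))
```

Even in this toy set the positive class is a small minority of the tokens, which is the situation the title calls sparse. A real experiment would need features and evaluation suited to that imbalance, for example precision and recall on the converb class rather than overall accuracy.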
