Conference: Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Investigating Machine Learning Methods for Language and Dialect Identification of Cuneiform Texts

Abstract

Identification of the languages written using cuneiform symbols is a difficult task due to the lack of resources and the problem of tokenization. The Cuneiform Language Identification task in VarDial 2019 addresses the problem of identifying seven languages and dialects written in cuneiform: Sumerian and six dialects of the Akkadian language (Old Babylonian, Middle Babylonian Peripheral, Standard Babylonian, Neo-Babylonian, Late Babylonian, and Neo-Assyrian). This paper describes the approaches taken by the SharifCL team to this problem in VarDial 2019. The best result belongs to an ensemble of Support Vector Machines and a naive Bayes classifier, both working on character-level features, with a macro-averaged F1-score of 72.10%.
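The abstract names two ingredients that are easy to illustrate: character-level features and the macro-averaged F1-score. The sketch below is only an illustration under stated assumptions, not the SharifCL system. The function names `char_ngrams` and `macro_f1`, the n-gram range of 1 to 3, and the toy inputs are all hypothetical; the abstract says only that the classifiers work on character-level features.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count every character n-gram of length n_min..n_max in text.

    The 1..3 range is an assumption chosen for illustration only.
    """
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over the classes present in gold."""
    labels = sorted(set(gold))
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy usage with placeholder strings standing in for cuneiform sign sequences.
features = char_ngrams("aba", n_min=1, n_max=2)
gold = ["Sumerian", "Sumerian", "Neo-Assyrian", "Neo-Assyrian"]
pred = ["Sumerian", "Neo-Assyrian", "Neo-Assyrian", "Neo-Assyrian"]
score = macro_f1(gold, pred)
```

Because macro-averaging weights every class equally, a dialect with few test lines affects the score as much as a frequent one, which matters when seven classes are compared.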
