Conference: International Multidisciplinary Information Technology and Engineering Conference

A Combination Part of Speech Tagger using Selected Voting Methods

Abstract

The development of resources in any language is an expensive process. Many languages, including the indigenous languages of South Africa, can be classified as resource-scarce, or lacking in tagging resources. This study investigates and applies techniques and methodologies for optimising the use of available resources and improving the accuracy of a tagger, using Afrikaans as a resource-scarce language. It aims to determine whether combination techniques can be effectively applied to improve the accuracy of a tagger for Afrikaans. To this end, existing methodologies for combining classification algorithms are investigated. Four taggers, trained using MBT, SVMlight, MXPOST and TnT respectively, are then combined into a combination tagger using weighted voting. Weights are calculated by means of total precision, tag precision and a combination of precision and recall. Although the combination of taggers does not consistently reduce the error rate relative to the baseline, it achieves an error rate reduction of up to 14.54% in some cases.
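The weighted voting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define its weighting schemes precisely, so the sketch assumes that "total precision" means a tagger's overall accuracy on held-out data and that "tag precision" means the precision of a tagger for the specific tag it proposes. The tagger names (`tagger_a`, `tagger_b`, `tagger_c`), the tag set and all data are invented for illustration, and the combined precision-and-recall weighting is left out.

```python
from collections import defaultdict

def total_precision(predicted, gold):
    """Overall accuracy of one tagger on held-out data (assumed meaning)."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

def tag_precision(predicted, gold):
    """Per-tag precision: of the tokens labelled t, the share that really were t."""
    proposed = defaultdict(int)
    correct = defaultdict(int)
    for p, g in zip(predicted, gold):
        proposed[p] += 1
        if p == g:
            correct[p] += 1
    return {tag: correct[tag] / proposed[tag] for tag in proposed}

def weighted_vote(proposals, weight_of):
    """Pick the tag with the largest summed weight.

    proposals: {tagger_name: proposed_tag} for a single token.
    weight_of: function (tagger_name, tag) -> float.
    Ties go to the tag that was proposed first.
    """
    scores = defaultdict(float)
    for name, tag in proposals.items():
        scores[tag] += weight_of(name, tag)
    return max(scores, key=scores.get)

# Invented held-out data: gold tags and each toy tagger's output.
gold = ["N", "V", "N", "ADJ", "V"]
held_out = {
    "tagger_a": ["N", "V", "N", "N", "V"],
    "tagger_b": ["N", "N", "N", "ADJ", "N"],
    "tagger_c": ["V", "V", "N", "ADJ", "N"],
}

totals = {name: total_precision(out, gold) for name, out in held_out.items()}
per_tag = {name: tag_precision(out, gold) for name, out in held_out.items()}

# The taggers disagree on a new token.
proposals = {"tagger_a": "N", "tagger_b": "ADJ", "tagger_c": "V"}

# Scheme 1: every vote is weighted by the tagger's overall accuracy.
vote_total = weighted_vote(proposals, lambda name, tag: totals[name])

# Scheme 2: every vote is weighted by the tagger's precision for that tag
# (weight 0.0 if the tagger never proposed the tag on held-out data).
vote_tag = weighted_vote(proposals, lambda name, tag: per_tag[name].get(tag, 0.0))

print(vote_total, vote_tag)
```

On this toy data the two schemes disagree: weighting by overall accuracy follows the most accurate tagger, while weighting by tag precision favours the tagger that is most reliable for the specific tag it proposes.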
