Journal: Information Fusion

Simulated annealing based classifier ensemble techniques: Application to part of speech tagging

Abstract

Part-of-Speech (PoS) tagging is an important pipeline module for almost all Natural Language Processing (NLP) application areas. In this paper we formulate PoS tagging within the frameworks of single-objective and multiobjective optimization. First, we propose a classifier ensemble technique for PoS tagging based on single-objective optimization (SOO) that exploits the search capability of simulated annealing (SA). We then devise a method based on multiobjective optimization (MOO) to solve the same problem, using AMOSA, a recently developed multiobjective technique based on simulated annealing. The characteristic features of AMOSA are its use of the amount of domination, its archive of solutions maintained during annealing, and its situation-specific acceptance probabilities. We use Conditional Random Field (CRF) and Support Vector Machine (SVM) as the underlying classification methods, which make use of a diverse set of features, mostly based on local contexts and orthographic constructs. We evaluate the proposed approaches on two Indian languages, Bengali and Hindi. The single-objective version achieves overall accuracies of 88.92% for Bengali and 87.67% for Hindi. The MOO-based ensemble yields overall accuracies of 90.45% and 89.88% for Bengali and Hindi, respectively.
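The abstract does not give the solution encoding, objective function, or cooling schedule, so the following is only a minimal sketch of how a single-objective SA search over classifier ensembles might look. Everything in it is an assumption made for illustration: the binary mask encoding (one bit per base classifier), plurality voting, accuracy as the objective, the geometric cooling parameters, and the toy tag sequences. The names `vote_accuracy` and `anneal_ensemble` are hypothetical and do not come from the paper.

```python
import math
import random
from collections import Counter


def vote_accuracy(mask, predictions, gold):
    """Accuracy of plurality voting over the classifiers selected by mask."""
    chosen = [p for keep, p in zip(mask, predictions) if keep]
    if not chosen:
        return 0.0  # an empty ensemble cannot tag anything
    correct = 0
    for i, tag in enumerate(gold):
        votes = Counter(p[i] for p in chosen)
        # Ties go to whichever tag Counter ranks first.
        if votes.most_common(1)[0][0] == tag:
            correct += 1
    return correct / len(gold)


def anneal_ensemble(predictions, gold, t_start=1.0, t_end=0.01,
                    alpha=0.9, iters_per_temp=20, seed=0):
    """Search for a classifier subset with simulated annealing (assumed setup)."""
    rng = random.Random(seed)
    n = len(predictions)
    current = [rng.random() < 0.5 for _ in range(n)]
    current_score = vote_accuracy(current, predictions, gold)
    best, best_score = current[:], current_score
    t = t_start
    while t > t_end:
        for _ in range(iters_per_temp):
            # Neighbour: flip one randomly chosen bit of the mask.
            candidate = current[:]
            j = rng.randrange(n)
            candidate[j] = not candidate[j]
            score = vote_accuracy(candidate, predictions, gold)
            delta = score - current_score
            # Always accept improvements; accept worse moves with
            # probability exp(delta / t), which shrinks as t cools.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, current_score = candidate, score
                if current_score > best_score:
                    best, best_score = current[:], current_score
        t *= alpha  # geometric cooling
    return best, best_score


# Toy development data (invented): gold tags and four base taggers' outputs.
gold = ["N", "V", "N", "ADJ", "V", "N"]
predictions = [
    ["N", "V", "N", "ADJ", "V", "V"],
    ["N", "V", "V", "ADJ", "V", "N"],
    ["V", "N", "V", "N", "N", "V"],
    ["N", "N", "N", "ADJ", "V", "N"],
]

best_mask, best_score = anneal_ensemble(predictions, gold)
print(best_mask, best_score)
```

In a realistic setting the rows of `predictions` would be the outputs of trained CRF and SVM taggers on held-out data, and the objective might be replaced or supplemented by other measures. A multiobjective variant such as AMOSA would keep an archive of non-dominated solutions instead of a single best mask, which this sketch does not attempt.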
