Journal: International Journal of Machine Learning and Cybernetics

Simultaneous feature and parameter selection using multiobjective optimization: application to named entity recognition



Abstract

In this paper, we propose an efficient algorithm based on the concept of multiobjective optimization (MOO) for performing feature selection and parameter optimization for any machine learning technique. Feature and parameter combinations have a significant effect on the accuracy of the classifier. We perform feature selection and parameter optimization for four different classifiers, namely conditional random field, support vector machine, memory-based learner and maximum entropy. The proposed algorithms are evaluated on the problem of named entity recognition, an important component in many text processing applications. We currently experiment with four different languages, namely Bengali, Hindi, Telugu and English. First, the proposed MOO-based technique is used to determine the appropriate features and parameters. For each of the classifiers, the algorithm produces a set of solutions on the final Pareto optimal front. Each solution represents a classifier with a particular feature and parameter combination. All these solutions are thereafter combined using a MOO-based classifier ensemble technique. Evaluation results show that the proposed approach attains F-measure (harmonic mean of recall and precision) values of 90.48%, 90.44%, 78.71% and 88.68% for Bengali, Hindi, Telugu and English, respectively. We also show that, for all the experimental settings, the proposed feature and parameter optimization technique performs reasonably better than the baseline systems, which were developed with random feature subsets. Comparisons with existing work also show the efficacy of our proposed algorithm.
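To make the two notions used in the abstract concrete, the following minimal Python sketch computes the F-measure as the harmonic mean of recall and precision and filters a list of candidate scores down to its non-dominated (Pareto optimal) members. It is only an illustration, not the authors' algorithm: the abstract does not state which objectives are optimized, so treating recall and precision as the two objectives is an assumption, and the function names `f_measure`, `dominates` and `pareto_front` as well as the example scores are hypothetical.

```python
from typing import List, Tuple

Score = Tuple[float, float]  # (recall, precision); the choice of objectives is an assumption


def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (0.0 when both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def dominates(a: Score, b: Score) -> bool:
    """True if a is no worse than b on every objective and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Score]) -> List[int]:
    """Return the indices of the scores that no other score dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Made-up scores for four hypothetical feature/parameter combinations.
example_scores = [(0.90, 0.80), (0.85, 0.88), (0.80, 0.79), (0.70, 0.95)]
front = pareto_front(example_scores)  # the third candidate is dominated by the first
front_f = [f_measure(*example_scores[i]) for i in front]
```

Each index in `front` corresponds to one classifier configuration that trades recall against precision differently; a set of this kind is what an ensemble step would then combine.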
