Computational and Mathematical Methods in Medicine

Feature Engineering for Drug Name Recognition in Biomedical Texts: Feature Conjunction and Feature Selection

Abstract

Drug name recognition (DNR) is a critical step in drug information extraction. Machine learning-based methods have been widely used for DNR with various types of features, such as part-of-speech, word shape, and dictionary features. Current machine learning-based methods usually rely on singleton features, possibly because combining singleton features into conjunction features causes a feature explosion and introduces a large number of noisy features. However, a singleton feature captures only one linguistic characteristic of a word, which is not sufficient for DNR when multiple characteristics should be considered together. In this study, we explore feature conjunction and feature selection for DNR, which have not previously been reported. We intuitively select 8 types of singleton features and combine them into conjunction features in two ways. Then, Chi-square, mutual information, and information gain are used to mine effective features. Experimental results show that feature conjunction and feature selection can improve the performance of the DNR system with a moderate number of features, and that our DNR system significantly outperforms the best system in the DDIExtraction 2013 challenge.
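
To make the idea of conjunction features followed by Chi-square selection concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the two combination schemes or the 8 feature types, so the pairwise conjunction, the feature names (`pos`, `shape`, `in_dict`), the helper names, and the toy tokens are all hypothetical. Only the Chi-square criterion is shown; mutual information and information gain are omitted.

```python
from itertools import combinations

def conjoin(singletons):
    """Return singleton features plus all pairwise conjunctions.

    Assumption: pairwise conjunction is one plausible scheme; the
    abstract does not say which two schemes the authors used.
    """
    items = sorted(singletons.items())  # sort so conjunction names are canonical
    feats = {f"{k}={v}" for k, v in items}
    pairs = {f"{k1}={v1}&{k2}={v2}"
             for (k1, v1), (k2, v2) in combinations(items, 2)}
    return feats | pairs

def chi_square(feature, label, samples):
    """Chi-square statistic of the 2x2 table (feature present?, label match?)."""
    n = len(samples)
    a = sum(1 for feats, y in samples if feature in feats and y == label)
    b = sum(1 for feats, y in samples if feature in feats and y != label)
    c = sum(1 for feats, y in samples if feature not in feats and y == label)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def select_top_k(samples, label, k):
    """Keep the k features with the highest Chi-square score for `label`."""
    vocab = set().union(*(feats for feats, _ in samples))
    ranked = sorted(vocab, key=lambda f: (-chi_square(f, label, samples), f))
    return ranked[:k]

# Hypothetical toy tokens (invented for illustration, not data from the paper).
TOY_TOKENS = [
    ({"pos": "NN", "shape": "Xx", "in_dict": "yes"}, "DRUG"),
    ({"pos": "NN", "shape": "x",  "in_dict": "no"},  "O"),
    ({"pos": "NN", "shape": "Xx", "in_dict": "no"},  "O"),
    ({"pos": "VB", "shape": "x",  "in_dict": "no"},  "O"),
    ({"pos": "NN", "shape": "Xx", "in_dict": "yes"}, "DRUG"),
    ({"pos": "NN", "shape": "x",  "in_dict": "yes"}, "O"),
]

SAMPLES = [(conjoin(tok), y) for tok, y in TOY_TOKENS]

if __name__ == "__main__":
    for name in select_top_k(SAMPLES, "DRUG", 3):
        print(name, chi_square(name, "DRUG", SAMPLES))
```

On this toy data the conjunction `in_dict=yes&shape=Xx` scores 6.0, while each of its component singletons scores 3.0. This matches the motivation in the abstract: a conjunction can separate drug names where neither characteristic does so alone, and a selection criterion then prunes the many uninformative conjunctions.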