Text mining has drawn significant attention in the recent past due to the rapid growth of biomedical and clinical records. Entity extraction is one of the fundamental components of biomedical text mining. In this paper, we propose a novel feature selection approach for entity extraction that exploits the concepts of deep learning and Particle Swarm Optimization (PSO). The system utilizes word embedding features along with several other features extracted by studying the properties of the datasets. We make the interesting observation that the compact word embedding features determined by PSO are more effective for entity extraction than the entire word embedding feature set. The proposed system is evaluated on three benchmark biomedical datasets: GENIA, GENETAG and AiMed. The effectiveness of the proposed approach is evident from significant performance gains over the baseline models as well as other existing systems. We observe improvements of 7.86%, 5.27% and 7.25% F-measure points over the baseline models for the GENIA, GENETAG and AiMed datasets, respectively.
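To illustrate how PSO can pick a compact subset of word embedding dimensions, the following is a minimal sketch of binary PSO over a 0/1 feature mask. It is not the implementation used in this work: the function names (`binary_pso`, `toy_fitness`), the hyperparameter values, and the toy fitness function are all assumptions made for illustration. In a real setting the fitness would presumably be a validation score, such as the F-measure of an entity extractor trained on the selected dimensions.

```python
import math
import random


def binary_pso(fitness, n_dims, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximise `fitness` over 0/1 masks of length `n_dims` (illustrative sketch)."""
    rng = random.Random(seed)
    # Each particle is a binary mask: 1 keeps an embedding dimension, 0 drops it.
    pos = [[rng.randint(0, 1) for _ in range(n_dims)] for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_dims)] for _ in range(n_particles)]

    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_dims):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + personal pull + global pull.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, v))  # clamp to keep the sigmoid stable
                # Binary PSO: the velocity sets the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical stand-in for a real evaluation: dimensions 0, 3 and 5 are
# "useful", and every selected dimension costs a small penalty.
USEFUL = {0, 3, 5}


def toy_fitness(mask):
    return sum(mask[i] for i in USEFUL) - 0.1 * sum(mask)


best_mask, best_fit = binary_pso(toy_fitness, n_dims=8)
print(best_mask, best_fit)
```

The per-dimension penalty in the toy fitness is what pushes the search towards compact masks; with a real classifier-based fitness, compactness would instead come from dropping dimensions that do not help the validation score.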