Conference: IEEE Congress on Evolutionary Computation

Data-Driven Regular Expressions Evolution for Medical Text Classification Using Genetic Programming

Abstract

In the medical field, text classification is one of the most important tasks and can significantly reduce human workload through structured information digitization and intelligent decision support. Despite the popularity of learning-based text classification techniques, their black-box nature makes it hard for humans to understand the classification or to fine-tune it manually for better precision and recall. This study proposes a novel regular expression-based text classification method that uses genetic programming (GP) to evolve regular expressions that classify a given medical text inquiry satisfactorily. Given a seed population of regular expressions (randomly initialized or manually constructed by experts), our method evolves the population using a novel regular expression syntax and a series of carefully chosen reproduction operators. Our method is evaluated on real-life medical text inquiries from an online healthcare provider and shows promising performance. More importantly, our method generates classifiers that can be fully understood, checked, and updated by medical doctors, which is crucial for medical practice.
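The general idea of evolving a seed population of regular expressions can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify their regular expression syntax or reproduction operators, so this toy version assumes each individual is a set of keywords rendered as an alternation, uses F1 as fitness, and applies simple keyword-toggle mutation and uniform crossover. The data, `VOCAB`, and all function names are hypothetical.

```python
import random
import re

# Toy labelled inquiries, invented for illustration (not the paper's data).
DATA = [
    ("I have a cough and a fever", True),
    ("persistent cough at night", True),
    ("fever for three days", True),
    ("pain in my left knee", False),
    ("skin rash on my arm", False),
    ("knee swelling after running", False),
]
# Hypothetical keyword vocabulary the operators may draw from.
VOCAB = ["cough", "fever", "knee", "rash", "pain", "night"]


def to_regex(terms):
    """Render a keyword set as an alternation, e.g. (?:cough|fever)."""
    return "(?:" + "|".join(re.escape(t) for t in sorted(terms)) + ")"


def f1(terms):
    """Fitness: F1 score of the regex on DATA (0.0 for an empty individual)."""
    if not terms:
        return 0.0
    pattern = re.compile(to_regex(terms))
    tp = fp = fn = 0
    for text, label in DATA:
        hit = pattern.search(text) is not None
        if hit and label:
            tp += 1
        elif hit and not label:
            fp += 1
        elif label:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mutate(terms, rng):
    """Toggle one random vocabulary keyword in or out of the individual."""
    return frozenset(set(terms) ^ {rng.choice(VOCAB)})


def crossover(a, b, rng):
    """Uniform crossover: keep each keyword of either parent with probability 0.5."""
    return frozenset(t for t in sorted(a | b) if rng.random() < 0.5)


def evolve(seed_population, generations=30, size=12, seed=0):
    """Evolve keyword-set individuals; needs at least two seed individuals."""
    rng = random.Random(seed)
    population = list(seed_population)
    for _ in range(generations):
        population.sort(key=f1, reverse=True)
        parents = population[: max(2, size // 2)]  # elitism: best half survives
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=f1)


if __name__ == "__main__":
    seeds = [frozenset({"pain"}), frozenset({"night"}), frozenset({"rash"})]
    best = evolve(seeds)
    print(to_regex(best), f1(best))
```

Because the surviving parents are carried over unchanged, the best fitness never decreases across generations, and the final individual remains a short, readable pattern that a domain expert could inspect or edit by hand.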
