Conference: International Conference in Advances in Electrical and Computer Technologies
Machine Learning Approach for Feature Interpretation and Classification of Genetic Mutations Leading to Tumor and Cancer

Abstract

Because genetic mutations are interpreted manually, it is difficult to diagnose a large number of patients and deliver their reports quickly. The interpretation therefore needs to be automated with a machine learning approach. To that end, a natural language processing (NLP) technique, term frequency-inverse document frequency (TF-IDF), is used to represent documents as fixed-size vectors for classifying mutations into the nine given classes. The main aim of this study is to identify the machine learning model best suited to this task, as measured by multi-class log-loss. Another important aim is to interpret the features used by the various machine learning algorithms, since feature interpretability is very important in the healthcare domain. Logistic regression (LR) with class balancing, trained on the top 1000 words of the 3-gram TF-IDF features, outperformed the other classifiers with a test log-loss of 0.98.
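
To make the two technical ingredients of the abstract concrete, the following is a minimal standard-library sketch of a fixed-size TF-IDF representation and of the multi-class log-loss metric. It is not the authors' implementation. The function names, the smoothed IDF formula, the reading of "3-gram" as word n-grams of length 1 to 3, the selection of the top features by raw frequency, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, max_n=3):
    """Return all word n-grams of length 1..max_n as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, max_n=3, top_k=1000):
    """Map each document to a fixed-size TF-IDF vector.

    Assumption: the vocabulary is the top_k most frequent n-grams
    across the corpus; the abstract does not say how its top 1000
    features were selected.
    """
    grams = [ngrams(d.lower().split(), max_n) for d in docs]
    corpus_counts = Counter(g for doc in grams for g in doc)
    vocab = [g for g, _ in corpus_counts.most_common(top_k)]
    doc_freq = Counter(g for doc in grams for g in set(doc))
    n_docs = len(docs)
    vectors = []
    for doc in grams:
        tf = Counter(doc)
        # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1
        vectors.append([
            tf[g] * (math.log((1 + n_docs) / (1 + doc_freq[g])) + 1)
            for g in vocab
        ])
    return vocab, vectors


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, p in zip(y_true, probs):
        # Clip to avoid log(0) when a model assigns zero probability.
        total -= math.log(min(max(p[label], eps), 1.0))
    return total / len(y_true)


# Toy documents (hypothetical, not from the study's data).
docs = ["mutation in gene causes tumor growth",
        "gene mutation linked to cancer"]
vocab, vectors = tfidf_vectors(docs, top_k=5)

# A classifier that predicts a uniform distribution over nine classes.
uniform = [[1 / 9] * 9 for _ in range(4)]
baseline = multiclass_log_loss([0, 3, 5, 8], uniform)
print(len(vocab), baseline)
```

A uniform guess over nine classes has a log-loss of ln 9, roughly 2.197, which gives a reference point for reading the reported test log-loss of 0.98: lower is better, and 0 would mean every true class received probability 1.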