Conference: International Symposium on Current Progress in Mathematics and Sciences

Multiclass Classification of Breast Cancer Large Scale Datasets for Detecting Cancer Drivers

Abstract

Over the past decade, scientists have found that even healthy genes can cause cancer because of hormonal growth disorders. Multiclass classification in data mining has recently become an important research topic, especially in the health sector. Classifying cancer cells also matters for almost all types of cancer; here we focus on breast cancer. Studying multiclass classification is therefore crucial for experts who diagnose cancer. Because datasets on breast cancer cell types are plentiful and large, the method used to process them must be as efficient as possible. Building on big data technologies, this study proposes a feature selection step for high-dimensional classification problems on datasets with dozens of features. Multiclass classification supports the adoption of big data solutions in such a study. The machine learning techniques assess a breast mass by analyzing a digitized image of a fine needle aspirate (FNA), which describes characteristics of the cell nuclei present in breast cancer. The various classes of breast mass in the datasets will be investigated further to determine their active role in cancer. In particular, this research aims to identify and analyze the ability of a Support Vector Machine (SVM) as the classification method, with ReliefF-based feature selection as the selection method, to diagnose breast cancer drivers. The approach could be an efficient method for cancer classification, with an accuracy of 91%.
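The abstract names ReliefF-based feature selection but gives no implementation details. As a rough illustration only, the sketch below scores features with the commonly described multiclass ReliefF rule: a feature loses weight when it differs between a sample and its nearest same-class neighbors, and gains weight when it differs from the nearest neighbors in other classes, weighted by class priors. The function names `relieff` and `top_features`, the choice `k=2`, and the nine-row toy dataset are all assumptions made for this example and do not come from the paper.

```python
from collections import Counter


def relieff(X, y, k=3):
    """Return one ReliefF-style relevance weight per feature (higher = more relevant)."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(d))

    priors = {c: cnt / n for c, cnt in Counter(y).items()}
    weights = [0.0] * d
    for i in range(n):  # use every sample once instead of random sampling
        # Group all other samples by class, each with its distance to sample i.
        by_class = {c: [] for c in priors}
        for t in range(n):
            if t != i:
                by_class[y[t]].append((dist(X[i], X[t]), t))
        for c, neigh in by_class.items():
            nearest = sorted(neigh)[:k]
            if not nearest:
                continue
            if c == y[i]:
                scale = -1.0  # nearest hits: penalise features that differ
            else:
                # nearest misses: reward differing features, weighted by class prior
                scale = priors[c] / (1.0 - priors[y[i]])
            for _, t in nearest:
                for j in range(d):
                    weights[j] += scale * diff(j, X[i], X[t]) / (len(nearest) * n)
    return weights


def top_features(weights, m):
    """Indices of the m highest-weighted features, best first."""
    return sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)[:m]


# Hypothetical toy data: feature 0 separates the three classes, feature 1 is noise.
X = [
    [0.0, 0.0], [0.1, 0.5], [0.2, 1.0],  # class 0
    [1.0, 0.5], [1.1, 1.0], [1.2, 0.0],  # class 1
    [2.0, 1.0], [2.1, 0.0], [2.2, 0.5],  # class 2
]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

weights = relieff(X, y, k=2)
selected = top_features(weights, 1)
print(weights, selected)
```

In a pipeline like the one the abstract describes, the columns picked by `top_features` would then be passed to an SVM classifier, for example scikit-learn's `sklearn.svm.SVC`, which handles multiclass problems with a one-vs-one scheme. Whether the authors used that library or that scheme is not stated.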
