Conference proceedings: Selected topics in applied computer science

Sequence Mining in DNA chips data for Diagnosing Cancer Patients



Abstract

Deoxyribonucleic acid (DNA) micro-arrays are a powerful means of observing thousands of gene expression levels at the same time. They produce high-dimensional datasets, which challenge conventional clustering methods. This high dimensionality calls for Self-Organizing Maps (SOMs) to cluster DNA micro-array data. DNA micro-array datasets are stored in huge biological databases for several purposes [1]. The proposed methods are based on the idea that selecting a gene subset that distinguishes all classes is a more effective way to solve a multi-class problem, and we propose a genetic programming (GP) based approach to analyze multi-class micro-array datasets. The biological dataset will be derived from multiple biological databases by an extraction procedure called DNA-Aggregator. We will design this biological aggregator to aggregate various datasets via an ontology developed by the DNA micro-array community, based on the concept of the Semantic Web, for integrating and exchanging biological data. The aggregator is composed of modules that retrieve data from various biological databases. It will also allow other applications to issue queries to recognize genes. The genes will be categorized into groups by a classification method that collects similar expression patterns. A clustering method such as k-means is required to discover groups of similar objects in the biological database and to characterize the underlying data distribution.
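To make the k-means step mentioned in the abstract concrete, the following is a minimal sketch of Lloyd's algorithm applied to a handful of made-up expression profiles. It is not the authors' implementation: the function name `kmeans`, the `profiles` values, and the choice of `k=2` are all assumptions introduced here for illustration, and a real micro-array dataset would have thousands of dimensions rather than three.

```python
import random


def kmeans(profiles, k, iterations=50, seed=0):
    """Group expression profiles (lists of floats) into k clusters.

    Hypothetical helper for illustration; plain Lloyd's algorithm
    with squared Euclidean distance.
    """
    rng = random.Random(seed)
    # Start from k distinct profiles chosen at random.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up data: three genes with low expression, three with high expression,
# each measured under three conditions.
profiles = [
    [0.1, 0.2, 0.1], [0.0, 0.3, 0.2], [0.2, 0.1, 0.0],
    [5.0, 5.2, 4.9], [5.1, 4.8, 5.0], [4.9, 5.0, 5.3],
]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

On this toy input the low-expression and high-expression genes end up in separate clusters; which cluster receives index 0 depends on the random initialization.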
