Journal: IEEE Transactions on Evolutionary Computation

A Multiobjective Evolutionary Conceptual Clustering Methodology for Gene Annotation Within Structural Databases: A Case of Study on the Gene Ontology Database

Abstract

Current tools and techniques devoted to examining the content of large databases are often hampered by their inability to support searches based on criteria that are meaningful to their users. These shortcomings are particularly evident in data banks storing representations of structural data such as biological networks. Conceptual clustering techniques have proven appropriate for uncovering relationships between features that characterize objects in structural data. However, typical conceptual clustering approaches recover the most obvious relations but fail to discover the less frequent, more informative underlying data associations. The combination of evolutionary algorithms with multiobjective and multimodal optimization techniques constitutes a suitable tool for solving this problem. We propose a novel conceptual clustering methodology termed evolutionary multiobjective conceptual clustering (EMO-CC), relying on the NSGA-II multiobjective (MO) genetic algorithm. We apply this methodology to identify conceptual models in structural databases generated from gene ontologies. These models can explain and predict phenotypes in the immunoinflammatory response problem, similar to those provided by gene expression or other genetic markers. The analysis of these results reveals that our approach uncovers cohesive clusters, even those comprising a small number of observations explained by several features, which makes it possible to describe objects and their interactions from different perspectives and at different levels of detail.
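
The abstract does not say which objectives EMO-CC optimizes, so the following is only a minimal sketch of the Pareto-dominance idea that NSGA-II builds on, not the authors' implementation. The two objectives (`coverage` and `specificity`, both maximized), the candidate names, and all scores are hypothetical and chosen purely for illustration. The sketch shows how a small but highly specific cluster can stay on the non-dominated front next to a broad, obvious one, where a single-objective ranking by coverage would discard it.

```python
from typing import List, Tuple

# (name, coverage, specificity); both objectives are maximized.
# The objective names are assumptions, not taken from the abstract.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] >= b[1] and a[2] >= b[2]
    strictly_better = a[1] > b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates, in input order."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical candidate clusters with invented scores.
candidates = [
    ("broad_cluster", 0.90, 0.20),
    ("mid_cluster", 0.50, 0.55),
    ("small_specific_cluster", 0.10, 0.95),
    ("weak_cluster", 0.40, 0.30),
]

front = pareto_front(candidates)
for name, coverage, specificity in front:
    print(f"{name}: coverage={coverage:.2f}, specificity={specificity:.2f}")
```

In this toy data only `weak_cluster` is dropped, because `mid_cluster` is better on both objectives. NSGA-II additionally ranks the population into successive fronts and uses a crowding-distance measure to keep the retained solutions spread out, which this sketch omits.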
