Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Multi-Factored Gene-Gene Proximity Measures Exploiting Biological Knowledge Extracted from Gene Ontology: Application in Gene Clustering



Abstract

Gene Ontology (GO) is a dynamic vocabulary for describing the cellular functions of genes and proteins. It comprises three sub-ontologies: Biological Process, Cellular Component, and Molecular Function. GO has several applications in bioinformatics, such as measuring gene-gene or protein-protein semantic similarity and identifying genes or proteins by their GO annotations for disease-gene and target discovery. Several measures of semantic similarity between genes have been proposed in the literature; they rely on the information content of GO terms, the structure of the GO tree, or a combination of both. However, most existing measures do not consider the different topological and information-theoretic aspects of GO terms collectively. Motivated by this gap, we first propose three novel semantic similarity/distance measures for genes that cover different aspects of the GO tree. These measures are then embedded in well-known multi-objective and single-objective clustering algorithms to identify functionally similar genes. For comparative analysis, 10 popular existing GO-based semantic similarity/distance measures and tools are also considered. Experimental results on Mouse genome, Yeast, and Human genome datasets demonstrate the superiority of multi-objective clustering algorithms when combined with the proposed multi-factored similarity/distance measures. The clustering outcomes are further validated with biological and statistical significance tests. Supplementary information is available at https://www.iitp.ac.in/sriparna/journals.html.
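To make the idea of an information-content-based measure concrete, the following is a minimal sketch of one classic existing approach (Resnik-style similarity with a best-match rule for genes). It is not one of the three measures proposed in the article, whose definitions are not given in the abstract. The term identifiers, the toy hierarchy, and the gene annotations are all hypothetical.

```python
import math

# Toy DAG: term -> set of parent terms (hypothetical IDs, not real GO terms).
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
    "AB": {"A", "B"},
}

# Hypothetical direct annotations of genes to terms.
ANNOTATIONS = {
    "g1": {"A1"},
    "g2": {"A2"},
    "g3": {"AB"},
    "g4": {"B"},
}


def ancestors(term):
    """Return the term together with all of its ancestors in the DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t), with p(t) the fraction of genes annotated to t or a descendant."""
    counts = {t: 0 for t in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set()
        for t in terms:
            covered |= ancestors(t)  # an annotation propagates to every ancestor
        for t in covered:
            counts[t] += 1
    n = len(ANNOTATIONS)
    return {t: -math.log(c / n) for t, c in counts.items() if c > 0}


IC = information_content()


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(IC[t] for t in common)


def gene_similarity(g1, g2):
    """Best-match rule: the highest term-term similarity over both annotation sets."""
    return max(resnik(a, b) for a in ANNOTATIONS[g1] for b in ANNOTATIONS[g2])


if __name__ == "__main__":
    for pair in [("g1", "g2"), ("g1", "g4"), ("g3", "g4")]:
        print(pair, round(gene_similarity(*pair), 4))
```

In this toy data, g1 and g2 share the ancestor A, so they score higher than g1 and g4, which share only the root. A measure of this kind uses information content alone; the article's point is that topological aspects of the GO tree can be combined with it.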
