BMC Genomics

GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data

Abstract

Hi-C sequencing offers a novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of functionally associated genes that lie close to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow several vertices to be associated with the same label. This model, called δ-teams with families, is particularly suitable for our application because it enables the handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to be closer to each other in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the subject of further experimental investigation.
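
To make the idea of generalizing δ-teams from sequences to graphs more concrete, the following is a minimal Python sketch of the basic model without families. It is not the GraphTeams implementation. It rests on assumptions not stated in the abstract: each genome is a weighted undirected graph over gene labels, edge weights are distances (for Hi-C data they would have to be derived from contact frequencies in some way), and a δ-team is a maximal gene set in which, in both graphs, any two members are linked by a chain of members with consecutive shortest-path distance at most δ. The function names `shortest_paths`, `delta_components` and `delta_teams` are hypothetical, and the refinement loop is a naive one, not the algorithm developed in the paper.

```python
import heapq

INF = float("inf")

def shortest_paths(graph, source):
    """Dijkstra distances from source; graph maps vertex -> {neighbor: weight}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def delta_components(graph, members, delta):
    """Split members into groups linked by chains of steps of distance <= delta.

    Distances are measured in the whole graph, so a shortest path may pass
    through vertices outside members (an assumption of this sketch).
    """
    remaining = set(members)
    groups = []
    while remaining:
        start = remaining.pop()
        group, stack = {start}, [start]
        while stack:
            dist = shortest_paths(graph, stack.pop())
            near = {v for v in remaining if dist.get(v, INF) <= delta}
            remaining -= near
            group |= near
            stack.extend(near)
        groups.append(group)
    return groups

def delta_teams(graph_a, graph_b, delta):
    """Naive refinement: split candidates until they are stable in both graphs."""
    pending = [set(graph_a) & set(graph_b)]  # genes present in both graphs
    teams = []
    while pending:
        candidate = pending.pop()
        for graph in (graph_a, graph_b):
            parts = delta_components(graph, candidate, delta)
            if len(parts) > 1:
                pending.extend(parts)
                break
        else:
            if len(candidate) >= 2:  # ignore trivial single-gene teams
                teams.append(candidate)
    return teams

def undirected(edges):
    """Build an adjacency dict from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph

# Toy example with made-up distances for four genes in two genomes.
genome_a = undirected([("a", "b", 1), ("b", "c", 1), ("c", "d", 5)])
genome_b = undirected([("a", "b", 1), ("b", "c", 4), ("c", "d", 1)])
print(delta_teams(genome_a, genome_b, delta=2))
```

In this toy input, genes `a` and `b` are within distance 2 of each other in both graphs, whereas `c` is close to them only in the first graph and `d` in neither, so `{a, b}` is the only non-trivial team. Handling gene families, where several vertices share one label, would need a different formulation and is not covered by this sketch.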