Exploiting the Functional and Taxonomic Structure of Genomic Data by Probabilistic Topic Modeling

Chen Xin


Abstract

In this paper, we present a method that enables both the homology-based approach and the composition-based approach to further study the functional core (i.e., the microbial core and the gene core, respectively). In the proposed method, major functional groups are identified by generative topic modeling, which can extract useful information from unlabeled data. First, we show that a generative topic model can be used to model the taxon abundance information obtained by the homology-based approach and to study the microbial core. The model treats each sample as a "document" that contains a mixture of functional groups, while each functional group (also known as a "latent topic") is a weighted mixture of species. Estimating the generative topic model for taxon abundance data therefore uncovers the distribution over latent functions (latent topics) in each sample. Second, we show that a generative topic model can also be used to study the genome-level composition of "N-mer" features (DNA subreads obtained by composition-based approaches). The model treats each genome as a mixture of latent genetic patterns (latent topics), while each such pattern is a weighted mixture of the "N-mer" features; the existence of core genomes can thus be indicated by a set of common N-mer features. After studying the mutual information between latent topics and gene regions, we provide an explanation of the functional roles of the uncovered latent genetic patterns. The experimental results demonstrate the effectiveness of the proposed method.
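The "document" analogy in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`nmer_counts`, `mixture_distribution`), the toy sequence, the species labels, and all probabilities are hypothetical and chosen only for illustration. The first function turns a genome into a bag of overlapping N-mers (the "words"); the second computes the word distribution implied by a topic mixture, p(word) = sum over topics k of theta[k] * phi[k][word], which is the standard mixture form of a generative topic model.

```python
from collections import Counter


def nmer_counts(sequence, n):
    """Count overlapping N-mers in a DNA sequence (the genome's 'words')."""
    seq = sequence.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def mixture_distribution(theta, phi):
    """Word distribution of one document under a topic mixture.

    theta: topic weights for one sample or genome (assumed to sum to 1).
    phi:   one dict per topic, mapping word -> probability within that topic.
    """
    words = set().union(*phi)  # every word that appears in any topic
    return {
        w: sum(t * topic.get(w, 0.0) for t, topic in zip(theta, phi))
        for w in words
    }


# Hypothetical toy genome: 8 bases give 6 overlapping 3-mers.
counts = nmer_counts("ACGTACGT", 3)

# Hypothetical latent topics (functional groups) over three made-up species.
phi = [
    {"species_a": 0.9, "species_b": 0.1},
    {"species_b": 0.5, "species_c": 0.5},
]
theta = [0.25, 0.75]  # hypothetical topic mixture of one sample
p = mixture_distribution(theta, phi)
```

Fitting theta and phi from real abundance or N-mer count data requires an inference procedure (for example variational inference or Gibbs sampling), which this sketch deliberately leaves out.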

Bibliographic details

Source
Computational Biology and Bioinformatics, IEEE/ACM Transactions on | 2012, Issue 4 | pp. 980-991 | 12 pages
Author
Chen Xin
Original format
PDF
Language
English

