Contrast Learning on ChIP-Seq Data of Transcription Factors.



Abstract

In this study, we analyzed TF ChIP-Seq data for 105 (i.e., 15 choose 2) pairs of transcription factors (TFs). Each pair is based on two TFs and three binding-dependent (BD) sequence datasets. The BD datasets were generated from the two TF ChIP-Seq datasets in each pair; that is, the three datasets contain TF binding site (TFBS) sequences of type 1, type 2, or both (i.e., 1 and 2).

The objective is to identify motif 1, motif 2, or both (i.e., interactive motifs) by contrasting two of the three BD datasets at a time using the contrast-motif-finder (CMF) algorithm. Each CMF output provides not only estimated consensus motifs based on position weight matrices (PWMs) but also likelihood ratios (LRs) as a measure of the enrichment of an identified motif. Using this idea, we construct a dataset in which the first column lists the genomic locations of the identified enriched motifs, columns 2 to n+1 contain the estimated consensus motifs, where n is the number of consensus motifs, and the last column holds a binary (0/1) indicator of which set each row comes from.

Once these datasets are obtained, we fit statistical models such as logistic regression, support vector machines (SVMs), and classification trees, and evaluate their performance (i.e., error rates) and selection power. We show that the SVM with a radial kernel appears to perform best when using all the motifs in the dataset, whereas the classification tree selects the fewest motifs in almost every analyzed dataset while its error rate and selection power do not drop much. As a result, we believe the classification tree is the better model, since it not only provides competitive predictive power with simpler models but also takes far less computational time than the other two models.
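The study design described in the abstract can be illustrated with a minimal sketch that enumerates the TF pairs and the pairwise contrasts among the three BD datasets. The TF names and BD dataset labels below are hypothetical placeholders (the abstract does not name them), and the sketch assumes each contrast is an unordered pair of BD datasets; whether CMF treats a contrast as directional is not stated in the abstract.

```python
from itertools import combinations
from math import comb

# Hypothetical placeholder names; the abstract does not list the 15 TFs.
TFS = [f"TF{i:02d}" for i in range(1, 16)]

# Assumed labels for the three binding-dependent (BD) datasets of a pair:
# sequences bound by TF 1 only, by TF 2 only, or by both.
BD_LABELS = ("type1_only", "type2_only", "both")


def enumerate_contrasts(tfs):
    """Yield (tf_a, tf_b, bd_x, bd_y) for every TF pair and every
    unordered contrast between two of the three BD datasets."""
    for tf_a, tf_b in combinations(tfs, 2):
        for bd_x, bd_y in combinations(BD_LABELS, 2):
            yield tf_a, tf_b, bd_x, bd_y


contrasts = list(enumerate_contrasts(TFS))
n_pairs = comb(len(TFS), 2)

print(n_pairs)         # number of TF pairs: 15 choose 2
print(len(contrasts))  # three contrasts per TF pair
```

Under these assumptions, each of the 105 TF pairs yields three contrasts, so the full design would involve 315 CMF runs.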


Bibliographic record

Author: Lee, Yuju
Affiliation: University of California, Los Angeles
Degree-granting institution: University of California, Los Angeles
Subjects: Bioinformatics; Mathematics; Computer science
Degree: M.S.
Year: 2014
Pages: 73
Original format: PDF
Language: English

