Journal: BMC Bioinformatics

A study on the application of topic models to motif finding algorithms

Abstract

Background: Topic models are statistical algorithms that discover the structure of a set of documents from the abstract topics they contain. Here we apply this approach to discovering the structure of the transcription factor binding sites (TFBS) contained in a set of biological sequences, a fundamental problem in molecular biology research for understanding transcriptional regulation. We present two methods that use topic models for motif finding. First, we developed an algorithm that treats a set of biological sequences as text documents and the k-mers they contain as words, then builds a correlated topic model (CTM) and iteratively reduces its perplexity. Second, we used the perplexity measurement of CTMs to improve our previous algorithm, which is based on a genetic algorithm and several statistical coefficients.
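The "sequences as documents, k-mers as words" idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`kmers`, `to_documents`, `unigram_perplexity`) are hypothetical, and a smoothed unigram model stands in for the correlated topic model purely to show how perplexity is computed over k-mer counts. It assumes overlapping k-mers and the standard definition of perplexity, the exponential of the negative mean log-likelihood per word.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return the overlapping k-mers of a sequence, in order of position."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def to_documents(sequences, k):
    """Treat each sequence as a document whose words are its k-mers.

    Each document is a bag of words: a Counter mapping k-mer -> count.
    """
    return [Counter(kmers(s, k)) for s in sequences]


def unigram_perplexity(docs, vocab_size, alpha=1.0):
    """Perplexity of an additively smoothed unigram model over the documents.

    Assumption: this is a simple stand-in for a topic model, NOT a CTM.
    perplexity = exp(-(1/N) * sum of log p(word) over all N word tokens)
    """
    total = Counter()
    for doc in docs:
        total.update(doc)
    n_tokens = sum(total.values())
    denom = n_tokens + alpha * vocab_size
    log_lik = sum(
        count * math.log((total[word] + alpha) / denom)
        for doc in docs
        for word, count in doc.items()
    )
    return math.exp(-log_lik / n_tokens)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    sequences = ["ACGTACGT", "TTACGTAA"]
    k = 4
    docs = to_documents(sequences, k)
    # 4**k possible DNA k-mers over the alphabet {A, C, G, T}.
    print(docs[0])
    print(unigram_perplexity(docs, vocab_size=4 ** k))
```

In a topic-model setting the per-word probabilities would come from the fitted model rather than from raw k-mer frequencies; lower perplexity indicates that the model explains the k-mer content of the sequences better.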
