BMC Bioinformatics

FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral

Abstract

Background: Regulatory motifs describe sets of related transcription factor binding sites (TFBSs) and can be represented as position frequency matrices (PFMs). De novo identification of TFBSs is a crucial problem in computational biology which includes the issue of comparing putative motifs with one another and with motifs that are already known. The relative importance of each nucleotide within a given position in the PFMs should be considered in order to compute PFM similarities. Furthermore, biological data are inherently noisy and imprecise. Fuzzy set theory is particularly suitable for modeling imprecise data, whereas fuzzy integrals are highly appropriate for representing the interaction among different information sources.

Results: We propose FISim, a new similarity measure between PFMs, based on the fuzzy integral of the distance of the nucleotides with respect to the information content of the positions. Unlike existing methods, FISim is designed to consider the higher contribution of better conserved positions to the binding affinity. FISim provides excellent results when dealing with sets of randomly generated motifs, and outperforms the remaining methods when handling real datasets of related motifs. Furthermore, we propose a new cluster methodology based on kernel theory together with FISim to obtain groups of related motifs potentially bound by the same TFs, providing more robust results than existing approaches.

Conclusion: FISim corrects a design flaw of the most popular methods, whose measures favour similarity of low information content positions. We use our measure to successfully identify motifs that describe binding sites for the same TF and to solve real-life problems. In this study the reliability of fuzzy technology for motif comparison tasks is proven.
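To make the idea of weighting positional similarity by information content concrete, here is a minimal Python sketch. It is not the FISim formulation: the abstract does not say which fuzzy integral, fuzzy measure, or nucleotide distance the authors use. The sketch assumes a Sugeno integral, an additive measure built from normalized information content, an L1-based column similarity, and two PFMs that are already aligned and of equal length. All function names are hypothetical.

```python
import math


def information_content(column):
    """Information content in bits of one PFM column of A, C, G, T counts."""
    total = sum(column)
    probs = [count / total for count in column]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2.0 - entropy  # 2 bits is the maximum for a 4-letter alphabet


def column_similarity(col_a, col_b):
    """Assumed similarity: 1 minus half the L1 distance of the frequencies."""
    total_a, total_b = sum(col_a), sum(col_b)
    l1 = sum(abs(a / total_a - b / total_b) for a, b in zip(col_a, col_b))
    return 1.0 - 0.5 * l1


def sugeno_integral(values, weights):
    """Sugeno integral of values in [0, 1] w.r.t. an additive measure.

    The measure of a set of positions is the sum of their normalized
    weights (an assumption; other fuzzy measures are possible).
    """
    total = sum(weights)
    if total == 0:  # all positions uninformative: fall back to equal weights
        weights, total = [1.0] * len(values), float(len(values))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    best, cumulative = 0.0, 0.0
    for i in order:
        cumulative += weights[i] / total
        best = max(best, min(values[i], cumulative))
    return best


def pfm_similarity(pfm_a, pfm_b):
    """Similarity of two aligned PFMs given as lists of [A, C, G, T] columns."""
    sims = [column_similarity(a, b) for a, b in zip(pfm_a, pfm_b)]
    weights = [
        (information_content(a) + information_content(b)) / 2.0
        for a, b in zip(pfm_a, pfm_b)
    ]
    return sugeno_integral(sims, weights)


motif = [[10, 0, 0, 0], [10, 0, 0, 0], [3, 3, 2, 2]]
conserved_mismatch = [[0, 10, 0, 0], [10, 0, 0, 0], [3, 3, 2, 2]]
weak_mismatch = [[10, 0, 0, 0], [10, 0, 0, 0], [2, 2, 3, 3]]
print(pfm_similarity(motif, conserved_mismatch))
print(pfm_similarity(motif, weak_mismatch))
```

Under these assumptions, a mismatch at a well conserved position lowers the score much more than a mismatch at a low information content position, which is the behaviour the abstract describes as the design goal.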