Journal: Bioinformatics

Improving performances of suboptimal greedy iterative biclustering heuristics via localization


Abstract

Motivation: Biclustering gene expression data is the problem of extracting submatrices of genes and conditions that exhibit significant correlation across both the rows and the columns of a data matrix of expression values. Even the simplest versions of the problem are computationally hard. Most of the proposed solutions therefore employ greedy iterative heuristics that locally optimize a suitably assigned scoring function.

Methods: We provide a fast and simple pre-processing algorithm called localization that reorders the rows and columns of the input data matrix so as to group correlated entries in small local neighborhoods within the matrix. The proposed localization algorithm is rooted in graph-theoretical methods that have been applied effectively to problems with a structure similar to that of biclustering. To evaluate the effectiveness of the localization pre-processing algorithm, we focus on three representative greedy iterative heuristic methods. We show how the localization pre-processing can be incorporated into each representative algorithm to improve biclustering performance. Furthermore, we propose a simple biclustering algorithm, Random Extraction After Localization (REAL), which randomly extracts submatrices from the localization pre-processed data matrix, eliminates those with low similarity scores, and provides the rest as correlated structures representing biclusters.

Results: We compare the proposed localization pre-processing with another pre-processing alternative, non-negative matrix factorization. We show that our fast and simple localization procedure provides results similar to, or even better than, those of the computationally heavy matrix factorization pre-processing with regard to H-value tests. We next demonstrate that the performance of the three representative greedy iterative heuristic methods improves with localization pre-processing when biological correlations, in the form of functional enrichment and PPI verification, constitute the main performance criteria. The fact that REAL, the random extraction method based on localization, performs better than the representative greedy heuristic methods under the same criteria also confirms the effectiveness of the suggested pre-processing method.
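
The random-extraction idea described in the Methods paragraph can be illustrated with a minimal Python sketch. The abstract does not give the similarity score, the submatrix shape, or the sampling scheme, so everything here is an assumption: the score is taken to be the mean squared residue commonly used in biclustering (lower means more coherent), submatrices are contiguous windows of a fixed, hypothetical `height` by `width`, and the input matrix is assumed to have already been reordered by some localization step, which is not implemented here. The function names are illustrative and are not the authors' code.

```python
import random


def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix given by rows x cols.

    Assumed stand-in for the similarity score; lower values mean the
    submatrix is closer to a purely additive (coherent) pattern.
    """
    n_r, n_c = len(rows), len(cols)
    sub = [[matrix[i][j] for j in cols] for i in rows]
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    total_mean = sum(row_means) / n_r
    residue = 0.0
    for i in range(n_r):
        for j in range(n_c):
            residue += (sub[i][j] - row_means[i] - col_means[j] + total_mean) ** 2
    return residue / (n_r * n_c)


def random_extraction(matrix, n_samples, height, width, max_score, seed=0):
    """Sample contiguous windows at random and keep the coherent ones.

    Assumes `matrix` was already reordered so that correlated entries sit
    in local neighborhoods; windows scoring above `max_score` are dropped.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    kept = []
    for _ in range(n_samples):
        r0 = rng.randrange(n_rows - height + 1)  # top-left corner of the window
        c0 = rng.randrange(n_cols - width + 1)
        rows = list(range(r0, r0 + height))
        cols = list(range(c0, c0 + width))
        score = mean_squared_residue(matrix, rows, cols)
        if score <= max_score:
            kept.append((rows, cols, score))
    return kept
```

Contiguous windows are used only because localization is described as grouping correlated entries in small neighborhoods; a real implementation might sample shapes and positions differently and would likely merge overlapping windows.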
