Towards Order-Preserving SubMatrix Search and Indexing


Abstract

The Order-Preserving SubMatrix (OPSM) has proven important for modelling biologically meaningful subspace clusters, because it captures the general tendency of gene expression across a subset of conditions. Given an OPSM query based on row or column keywords, it is desirable to retrieve OPSMs quickly, via indices, from a large gene expression dataset or from OPSM data. However, mining OPSMs from a gene expression dataset takes a long time, and the volume of OPSM data is huge. In this paper, we investigate how to index these two kinds of data and first present a naive solution, pfTree, which applies a prefix tree. Because searching this tree is inefficient, we propose an optimized indexing method, pIndex. Unlike pfTree, pIndex employs row and column header tables to traverse the related branches in a bottom-up manner. Further, we introduce two pruning rules based on the number and the order of keywords. To reduce the number of column keyword candidates for fuzzy queries, we introduce FIT, a First Item of keywords roTation method, which reduces that number from n! to n. We conduct extensive experiments with real datasets on a single machine, Hadoop, and Hama; the results show the efficiency and scalability of the proposed techniques.
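
The ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the node layout, the pruning rules, or the FIT procedure, so the class and function names, the choice to store OPSM identifiers at the node that ends each column sequence, and the reading of FIT as cyclic rotation (each keyword taken once as the first item) are all assumptions made for illustration.

```python
from collections import defaultdict


class Node:
    """One prefix-tree node, labelled with a column (condition) keyword."""

    def __init__(self, column, parent=None):
        self.column = column      # None for the root
        self.parent = parent
        self.children = {}        # column keyword -> child Node
        self.opsm_ids = set()     # OPSMs whose column sequence ends here


class PrefixTreeIndex:
    """Hypothetical prefix tree over OPSM column sequences with a column header table."""

    def __init__(self):
        self.root = Node(None)
        # Assumed header table: column keyword -> all nodes carrying that label.
        self.column_header = defaultdict(list)

    def insert(self, columns, opsm_id):
        node = self.root
        for c in columns:
            child = node.children.get(c)
            if child is None:
                child = Node(c, node)
                node.children[c] = child
                self.column_header[c].append(child)
            node = child
        node.opsm_ids.add(opsm_id)

    def _collect(self, node):
        # Every OPSM stored in the subtree rooted at `node`.
        found = set(node.opsm_ids)
        for child in node.children.values():
            found |= self._collect(child)
        return found

    def search_prefix(self, columns):
        """Top-down search: OPSMs whose column order starts with `columns`."""
        node = self.root
        for c in columns:
            node = node.children.get(c)
            if node is None:
                return set()
        return self._collect(node)

    def search_subsequence(self, columns):
        """Bottom-up search: OPSMs whose column order contains `columns` in this order.

        Starts from the header-table entries of the last keyword and walks
        toward the root, matching the remaining keywords in reverse.
        """
        if not columns:
            return set()
        result = set()
        for node in self.column_header.get(columns[-1], []):
            i = len(columns) - 2          # next keyword still to match
            cur = node.parent
            while cur is not None and cur.column is not None and i >= 0:
                if cur.column == columns[i]:
                    i -= 1
                cur = cur.parent
            if i < 0:                     # all keywords matched on the path
                result |= self._collect(node)
        return result


def rotations(keywords):
    """The n cyclic rotations of a keyword list (assumed reading of FIT)."""
    return [keywords[i:] + keywords[:i] for i in range(len(keywords))]
```

Under these assumptions, a column query whose keyword order is unknown would be tried once per rotation, n candidates in total, rather than once per permutation, n! in total.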


Bibliographic record

Source
International Conference on Database Systems for Advanced Applications | 2015 | 18 pages
Authors
Tao Jiang; Zhanhuai Li; Qun Chen; Kaiwen Li; Zhong Wang; Wei Pan
Original format: PDF
Chinese Library Classification: TP311.13
Keywords
OPSM; Gene expression data; pIndex; FIT; Queries


Related literature


1. Constrained query of order-preserving submatrix in gene expression data [J]. Tao Jiang, Zhanhuai Li, Xuequn Shang. Frontiers of Computer Science in China. 2016, No. 6
2. On the Deep Order-Preserving Submatrix Problem: A Best Effort Approach [J]. Byron J. Gao, Obi L. Griffith, Martin Ester. IEEE Transactions on Knowledge and Data Engineering. 2012, No. 2
3. Solving the Order-Preserving Submatrix Problem via Integer Programming [J]. Andrew C. Trapp, Oleg A. Prokopyev. INFORMS Journal on Computing. 2010, No. 3
4. Towards Order-Preserving SubMatrix Search and Indexing [C]. Tao Jiang, Zhanhuai Li, Qun Chen. International Conference on Database Systems for Advanced Applications; International Workshop on Semantic Computing and Personalization; International Workshop on Big Data Management and Service. 2015
5. Enhancing user search experience in digital libraries with rotated latent semantic indexing [D]. Serhiy Polyakov. 2015
6. Indexing Theory, Indexing Methods and Search Devices [O]. Irwin H. Pizer. 1964
7. Indexing and Search of Order-Preserving Submatrix for Gene Expression Data [O]. Tao Jiang, Bolin Chen, Juntao Li. 2019
