Information retrieval via universal source coding.

Abstract

This dissertation explores the intersection of information retrieval and universal source coding and studies an optimal multidimensional source representation from an information-theoretic point of view. Previous research on information retrieval has focused on learning probabilistic or deterministic source models based primarily on two types of source representation: fixed-shape partitions and uniform regions. We study the limitations of these conventional source representations in capturing the semantics of given multidimensional source sequences and propose a new type of primitive source representation generated by a universal source coding technique. We propose a multidimensional incremental parsing algorithm, extended from Lempel-Ziv incremental parsing, and its three component schemes for multidimensional source coding. The properties of the proposed coding algorithm are exploited in two-dimensional lossless and lossy source coding. The proposed coding algorithm parses a given multidimensional source sequence into a number of variable-size patches. We call this methodology a parsed representation.

Based on this source representation, we propose an information retrieval framework that analyzes a set of source sequences using a linguistic processing technique, and we implement content-based image retrieval systems. We examine the relevance of the proposed source representation by comparing it with the conventional representation of visual information. To further extend the proposed framework, we apply a probabilistic linguistic processing technique to model the latent aspects of a set of documents. In addition, beyond the symbol-wise pattern matching paradigm employed in the source coding and the image retrieval systems, we devise a robust pattern matching scheme that compares the first- and second-order statistics of source patches. Qualitative and quantitative analyses of the proposed framework support the superiority of the information retrieval framework based on the parsed representation. The proposed source representation technique and the information retrieval frameworks encourage future work on a systematic way of understanding multidimensional sources that parallels a linguistic structure.
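
For orientation, the classical one-dimensional Lempel-Ziv incremental parsing (LZ78), which the abstract names as the starting point, can be sketched as follows. This is only the textbook 1D scheme, not the dissertation's multidimensional algorithm or its three component schemes, which the abstract does not specify. The function names `lz78_parse` and `lz78_decode` are hypothetical and chosen for illustration.

```python
def lz78_parse(sequence):
    """Parse a sequence into LZ78 phrases.

    Each phrase is the shortest prefix of the remaining input that has not
    been seen before, encoded as (index of longest known prefix, new symbol).
    Index 0 denotes the empty phrase.
    """
    dictionary = {(): 0}   # phrase (tuple of symbols) -> phrase index
    phrases = []           # output list of (prefix_index, symbol) pairs
    current = ()
    for symbol in sequence:
        candidate = current + (symbol,)
        if candidate in dictionary:
            # Still a known phrase: keep extending it.
            current = candidate
        else:
            # First unseen phrase: emit it and add it to the dictionary.
            phrases.append((dictionary[current], symbol))
            dictionary[candidate] = len(dictionary)
            current = ()
    if current:
        # Input ended inside a known phrase; emit it via its parent phrase.
        phrases.append((dictionary[current[:-1]], current[-1]))
    return phrases


def lz78_decode(phrases):
    """Rebuild the original symbol list from (prefix_index, symbol) pairs."""
    table = [()]           # table[i] is the phrase with index i
    output = []
    for prefix_index, symbol in phrases:
        phrase = table[prefix_index] + (symbol,)
        table.append(phrase)
        output.extend(phrase)
    return output


if __name__ == "__main__":
    parsed = lz78_parse("abababab")
    print(parsed)
    print("".join(lz78_decode(parsed)))
```

Each emitted phrase here is a variable-length run of symbols. The variable-size patches described in the abstract play the analogous role for multidimensional sources, although how they are grown and matched is not given here.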


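Bibliographic record

Author: Bae, Soo Hyun
Author affiliation: Georgia Institute of Technology
Degree-granting institution: Georgia Institute of Technology
Subject: Engineering, Electronics and Electrical; Computer Science
Degree: Ph.D.
Year: 2008
Pages: 130 p.
Total pages: 130
Original format: PDF
Language: eng
CLC classification: Radio electronics, telecommunications technology; automation technology, computer technology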

