
On improving information retrieval performance from structured, semi-structured and un-structured information sources.


Abstract

The field of unstructured data retrieval for simple data types such as text and structured data retrieval in relational data models for transactional processing has already been well researched and commercially developed. However, more complex data types and models such as XML (as semi-structured data), data warehouses (as structured data), images (as unstructured data), etc. pose additional research challenges. The goal of this work is to address such information retrieval performance issues and challenges.

As XML is an evolving semi-structured data representation format, techniques for indexing and retrieval of XML data are drawing increasing attention. We have proposed a memory-efficient index structure and an efficient algorithm for incremental indexing of XML document collections. The experimental results show that our proposed index structure outperforms earlier schemes in terms of indexing time and storage requirements.

Given the growth in size of image collections over the last few years, Content-Based Image Retrieval (CBIR) systems are required to effectively and efficiently access images using information contained in them. Perception-based image retrieval, on the other hand, plays an important role in overcoming some of the semantic problems associated with CBIR. We have proposed a method that uses the concept of Inverse Image Frequency for perception-based color image quantization to improve traditional quantization schemes. Additionally, a cluster-based approach for efficient CBIR that uses a similarity-preserving space transformation method is proposed. Our results show that it offers superior response time with sufficiently high retrieval accuracy.

Lastly, for improving online analytical processing, our focus has been on the more challenging and evolving multidimensional data model. Earlier work does not completely address performance issues, such as query response time and view maintenance time, in data warehouses. We propose a hybrid approach for the selection of views that combines the improved response time of the static approach and the automated tuning capability of the dynamic approach. Experimental results show that the hybrid approach outperforms both the static and the dynamic approaches to view selection.

For future work, we suggest the integration of our results in these different areas and the evaluation of their applicability to real-life multimodal systems applications.


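Bibliographic record

Author: Shah, Biren N.
Author affiliation: University of Louisiana at Lafayette
Degree-granting institution: University of Louisiana at Lafayette
Subject: Computer Science
Degree: Ph.D.
Year: 2005
Pages: 183 p.
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology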

