Querying, Exploring and Mining the Extended Document.

Abstract

The evolution of the Web into an interactive medium that encourages active user engagement has ignited a huge increase in the amount, complexity and diversity of available textual data. This evolution forces us to reevaluate our view of documents as simple pieces of text and of document collections as immutable and isolated. Extended documents published in the context of blogs, micro-blogs, on-line social networks and customer feedback portals can be associated with a wealth of meta-data in addition to their textual component: tags, links, sentiment, entities mentioned in text, etc. Collections of user-generated documents grow, evolve, co-exist and interact: they are dynamic and integrated.

These unique characteristics of modern documents and document collections present us with exciting opportunities for improving the way we interact with them. At the same time, this additional complexity, combined with the vast amounts of available textual data, presents us with formidable computational challenges. In this context, we introduce, study and extensively evaluate an array of effective and efficient solutions for querying, exploring and mining extended documents and dynamic, integrated document collections.

For collections of socially annotated extended documents, we present an improved probabilistic search and ranking approach based on our growing understanding of the dynamics of the social annotation process.

For extended documents, such as blog posts, associated with entities extracted from text and categorical attributes, we enable their interactive exploration through the efficient computation of strong entity associations. Associated entities are computed for all possible attribute value restrictions of the document collection.

For extended documents, such as user reviews, annotated with a numerical rating, we introduce a keyword-query refinement approach. The solution enables the interactive navigation and exploration of large result sets.

We extend the skyline query to document streams, such as news articles, associated with categorical attributes and partially-ordered domains. The technique incrementally maintains a small set of recent, uniquely interesting extended documents from the stream.

Finally, we introduce a solution for the scalable integration of structured data sources into Web search. Queries are analyzed in order to determine what structured data, if any, should be used to augment Web search results.

Bibliographic record

Author: Sarkas, Nikolaos.
Author affiliation: University of Toronto (Canada).
Degree-granting institution: University of Toronto (Canada).
Discipline: Computer Science.
Degree: Ph.D.
Year: 2011
Pages: 204 p.
Total pages: 204
Original format: PDF
Language of text: eng

Similar documents

1. Mining Related Queries from Web Search Engine Query Logs Using an Improved Association Rule Mining Model [J]. Xiaodong Shi, Christopher C. Yang. Journal of the American Society for Information Science and Technology, 2007, No. 12.
2. Mining Query Plans for Finding Candidate Queries and Sub-Queries for Materialized Views in BI Systems Without Cube Generation [J]. Atul Thakare, Srijay Deshpande, Amit Kshirsagar. Computing and Informatics, 2019, No. 2.
3. Ziggy: Characterizing Query Results for Data Explorers [C]. Thibault Sellam, Martin Kersten. International Conference on Very Large Data Bases, 2016.
4. Extending APEx (Accuracy-Aware Differentially Private Data Exploration) to Multiple Table Queries [D]. Singh, Karmanjot. 2020.
5. PhyloExplorer: a web server to validate, explore and query phylogenetic trees [O]. Vincent Ranwez, Nicolas Clairon, Frédéric Delsuc. 2009.
6. Using and extending itemsets in data mining: query approximation, dense itemsets, and tiles [O]. Seppänen, Jouni K. 2006.
7. Applicability of the beamed power concept to lunar rovers, construction, mining, explorers and other mobile equipment [R]. Christian, Jose L., Jr. 1989.
