A guided, low-latency, and relevance propagation framework for interactive multimedia search.

Abstract

This thesis investigates a number of problems associated with efficient and engaging ways of executing a multi-level interactive multimedia search. These problems are of interest because the availability of multimedia sources, both professional and personal, continues to grow in tandem with the need for users to search these libraries for consumable entertainment, captured personal memories, and automatically recorded events with little or no forethought to manual indexing.

The goal of this thesis is to develop a framework that guides the user through his or her search process by providing dynamic suggestions and information from automatic algorithms, while simultaneously leveraging cues observed during the search process to provide a customized set of results that most precisely matches the user's search target. In achieving this goal, the system aids the user through both explicit interaction and subsequent result personalization based on implicit search choices. A prototype of the proposed system, called CuZero, has been implemented and evaluated across multiple challenging databases to discover new search techniques previously unavailable.

Addressing problems in traditional query formulation, a system that interactively guides the user is proposed. While previous works allow a user to specify different modalities for a multimedia search, such as textual keywords and image examples, this work also introduces a large library of 374 semantic concepts. Semantic concepts use pre-trained visual models to bridge the gap in perception between what a machine computes for a multimedia document and what a user can do with that computation. For example, a user need only use the concept "crowd" to return content containing large numbers of people attending a basketball tournament, a political protest, or an exclusive fashion show. Building on the familiar technique of text entry (typing in text keywords), the system returns a small subset of dynamically suggested concepts from a lexical mapping and statistical expansion of the user's entered text. These suggestions both engage the user and inform him or her about what the system has indexed with respect to the current query text. Additionally, the introduction of a unique query visualization panel allows the user to interactively include arbitrary modalities (text, images, concepts, etc.) in his or her query.

After a query is formulated through this guided and informative process, the formulation panel is subsequently used for query navigation, allowing the user to instantly review numerous query permutations with no perceived latency. With the intuitive mantra "closer to something is more like it", the user can instantly change the weights of the various parameters in his or her query. To accommodate this flexibility, previous interactive search systems resorted to burdening the user with a secondary query specification stage to tweak individual modality weights. The proposed approach to result browsing, by contrast, allows the user to navigate the query and result space in parallel, spanning either a wide breadth of query permutations or a deep result list for any one query permutation. Another classic barrier in multimedia search is the sensible inclusion of new search modalities: if no longer constrained to color or text cues, how can one include motion, audio, and local object similarity that have no textual correspondence? Fortunately, the proposed query navigation panel was created in such a way that any modalities developed in the future can be included with no additional algorithmic changes. This flexibility is best exemplified during the result browsing process, where a user can add another image for example-based search, or a personalized snapshot of seen results, to the query to quickly home in on desirable results.

A final proposal in this work is a scalable, real-time result personalization technique. Here, state-of-the-art graph-based label propagation is aided by data approximation techniques in a proposed algorithm that achieves higher accuracy in only a small fraction of the computation time when evaluated on a standard benchmark dataset. Using the real-time implementation of this technique, user search results can be personalized without the need to solicit result preferences en masse. (Abstract shortened by UMI.)
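The "closer to something is more like it" idea from the query navigation discussion can be illustrated with a small sketch. This is not CuZero's actual weighting scheme, which the abstract does not specify. It assumes each query modality has a hypothetical 2D anchor position on the panel, that weights fall off as the inverse of the cursor's distance to each anchor, and that results are fused by a weighted sum of per-modality scores. All names (`modality_weights`, `fuse_scores`, the anchor layout) are illustrative.

```python
import math


def modality_weights(cursor, anchors, eps=1e-6):
    """Map a cursor position to normalized per-modality weights.

    Assumed rule (not from the thesis): weight is proportional to
    1 / distance, so moving the cursor toward an anchor raises that
    modality's share of the query.
    """
    raw = {name: 1.0 / (math.dist(cursor, pos) + eps)
           for name, pos in anchors.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


def fuse_scores(weights, scores):
    """Rank documents by the weighted sum of their per-modality scores.

    scores maps modality name -> {document id -> relevance score}.
    """
    docs = set()
    for per_doc in scores.values():
        docs.update(per_doc)
    fused = {
        d: sum(w * scores.get(m, {}).get(d, 0.0) for m, w in weights.items())
        for d in docs
    }
    # Sort by descending fused score, breaking ties by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical panel with a text keyword anchor and a "crowd" concept anchor.
anchors = {"text": (0.0, 0.0), "crowd": (1.0, 0.0)}
scores = {
    "text": {"a": 0.9, "b": 0.1},
    "crowd": {"a": 0.2, "b": 0.8},
}
near_text = fuse_scores(modality_weights((0.25, 0.0), anchors), scores)
near_crowd = fuse_scores(modality_weights((0.75, 0.0), anchors), scores)
```

Because the per-modality scores are computed once and only the weights change with the cursor, re-ranking in a sketch like this is a cheap weighted sum, which is one plausible way to keep navigation responsive. A new modality would only need an anchor and a score table.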
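The result personalization step described in the abstract builds on graph-based label propagation. The sketch below shows only a plain, well-known iterative propagation scheme over a normalized similarity graph. It omits the data approximation techniques that the thesis adds for speed, since the abstract does not detail them, and it should not be read as the thesis algorithm. The function name, the `alpha` damping value, and the toy graph are assumptions.

```python
import math


def propagate_labels(W, y, alpha=0.8, iterations=50):
    """Spread sparse relevance labels over a similarity graph.

    W: symmetric list-of-lists of non-negative affinities, zero diagonal.
    y: initial labels (+1 relevant, -1 irrelevant, 0 unlabeled).
    Returns one relevance score per node.
    """
    n = len(W)
    degree = [sum(row) for row in W]
    # Symmetric normalization: S = D^(-1/2) W D^(-1/2).
    S = [
        [
            W[i][j] / math.sqrt(degree[i] * degree[j])
            if degree[i] > 0 and degree[j] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    f = [float(v) for v in y]
    for _ in range(iterations):
        # Each node mixes its neighbors' scores with its own initial label.
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Toy graph: results 0-1 and 2-3 are visually similar, 1-2 only weakly so.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
y = [1, 0, 0, -1]  # the user implicitly liked result 0 and skipped result 3
relevance = propagate_labels(W, y)
ranking = sorted(range(len(relevance)), key=lambda i: -relevance[i])
```

In this toy case the unlabeled result next to the liked one ends up with a positive score and the one next to the skipped result with a negative score, so a few implicit cues are enough to re-order the whole list.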

Bibliographic record

  • Author: Zavesky, Eric
  • Author affiliation: Columbia University
  • Degree-granting institution: Columbia University
  • Subject: Engineering, Electronics and Electrical; Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 158
  • Original format: PDF
  • Language of text: English
