Journal: IEEE Transactions on Visualization and Computer Graphics

cite2vec: Citation-Driven Document Exploration via Word Embeddings

Abstract

Effectively exploring and browsing document collections is a fundamental problem in visualization. Traditionally, document visualization is based on a data model that represents each document as the set of its comprised words, effectively characterizing what the document is. In this paper we take an alternative perspective: motivated by the manner in which users search documents in the research process, we aim to visualize documents via their usage, or how documents tend to be used. We present a new visualization scheme — cite2vec — that allows the user to dynamically explore and browse documents via how other documents use them, information that we capture through citation contexts in a document collection. Starting from a usage-oriented word-document 2D projection, the user can dynamically steer document projections by prescribing semantic concepts, both in the form of phrase/document compositions and document:phrase analogies, enabling the exploration and comparison of documents by their use. The user interactions are enabled by a joint representation of words and documents in a common high-dimensional embedding space where user-specified concepts correspond to linear operations of word and document vectors. Our case studies, centered around a large document corpus of computer vision research papers, highlight the potential for usage-based document visualization.
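The idea that user-specified concepts correspond to linear operations on word and document vectors can be illustrated with a minimal sketch. Everything below is hypothetical: the 3-D vectors, the paper names, and the use of cosine similarity for ranking are assumptions made for illustration, not the embedding or the metric used by cite2vec.

```python
import numpy as np

# Toy joint embedding space. All vectors and names are made up for
# illustration; a real system would learn them from citation contexts.
words = {
    "detection": np.array([1.0, 0.0, 0.0]),
    "segmentation": np.array([0.0, 1.0, 0.0]),
    "dataset": np.array([0.0, 0.0, 1.0]),
}
docs = {
    "paper_A": np.array([0.9, 0.1, 0.0]),  # cited mostly for detection
    "paper_B": np.array([0.1, 0.9, 0.0]),  # cited mostly for segmentation
    "paper_C": np.array([0.7, 0.0, 0.7]),  # cited as a detection dataset
    "paper_D": np.array([0.0, 0.7, 0.7]),  # cited as a segmentation dataset
}


def unit(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    return v / np.linalg.norm(v)


def rank_docs(query, exclude=()):
    """Return document names sorted by cosine similarity to the query vector."""
    q = unit(query)
    scores = {
        name: float(unit(vec) @ q)
        for name, vec in docs.items()
        if name not in exclude
    }
    return sorted(scores, key=scores.get, reverse=True)


# Composition: the concept "detection" + "dataset" is the sum of the two vectors.
composition = words["detection"] + words["dataset"]

# Analogy: paper_C is to "detection" as which document is to "segmentation"?
analogy = docs["paper_C"] - words["detection"] + words["segmentation"]

top_composition = rank_docs(composition)[0]
top_analogy = rank_docs(analogy, exclude=("paper_C",))[0]
print(top_composition, top_analogy)
```

In this toy setup the composition query ranks the document cited as a detection dataset first, and the analogy query surfaces its segmentation counterpart. Because words and documents share one space, the same ranking function serves both kinds of query.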
