
Identifying similar objects in social networks and digital libraries.

Abstract

With the rise of the computer age, many kinds of information can be easily accessed in digital format. However, the objects found within this information, such as people, places, dates, and firms, form tangled and complex relationships that are usually hard to untangle.

In this dissertation, we aim to unravel the relationships among objects at the finest level: what is the similarity level between any pair of objects? Discovering similar objects can be the foundation of several research problems and applications. For example, objects can be clustered into groups by merging similar objects together. This merging process can be performed recursively so that a hierarchical structure of these objects is constructed. In addition, hidden relationships among objects can be inferred by examining similar objects that do not explicitly interact with each other.

This dissertation examines the problem of discovering similar objects in two different settings: (1) discovering similar objects based on the interactions among them, and (2) discovering similar objects based on their meta-data. We mainly focus on the first setting. The interactions among objects are modeled by a network structure in which each node represents one object and an edge is present if the two objects have interacted with each other. In the second setting, we examine the similarity problem when information other than the interaction history is available. Here we target digital library objects, such as papers, authors, and publication venues (i.e., the conference or journal in which a paper was published). The meta-data of these objects could be, for example, the citation count of a paper, the affiliation of an author, or the topics of a conference. These meta-data are used to infer similar objects, such as similar terms, similar venues, or relevant authors given a topic.

To validate our proposed models and methodologies, we conducted experiments on several data sets to discover the hidden relationships among the target objects. These include (1) the relationships among the authors, papers, and venues in a given digital library, (2) the actors, actresses, and movies in a given movie data set, and (3) the diseases and genes of patients. In addition, we implemented two live systems based on the CiteSeerX digital library to bring several of these research results into practical products. The first system, CollabSeer, recommends potential collaborators based on a user's research interests and previous coauthoring behavior. The second, CSSeer, recommends a list of experts for a given term of interest, based on the similarity score between the query term and the publication and citation history of the authors. Both systems are highly efficient in handling more than one million papers and over 300,000 disambiguated authors.
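The first setting in the abstract (objects as nodes, an edge whenever two objects have interacted, and hidden relationships inferred from similar objects that never interacted directly) can be illustrated with a small sketch. The abstract does not say which similarity measure the dissertation uses, so the Jaccard coefficient over neighbor sets is swapped in here purely as an assumption. The function names, the threshold, and the coauthorship edges are all hypothetical and invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_network(interactions):
    """Map each object to the set of objects it has interacted with."""
    neighbors = defaultdict(set)
    for a, b in interactions:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


def jaccard(neighbors, u, v):
    """Shared neighbors of u and v divided by all neighbors of u or v.

    Assumption: Jaccard is an illustrative stand-in, not necessarily
    the measure used in the dissertation.
    """
    union = neighbors[u] | neighbors[v]
    if not union:
        return 0.0
    return len(neighbors[u] & neighbors[v]) / len(union)


def similar_non_interacting(neighbors, threshold=0.5):
    """Return (u, v, score) for pairs with no direct edge whose
    similarity is at least `threshold`, highest score first."""
    pairs = []
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already interact, so the relationship is not hidden
        score = jaccard(neighbors, u, v)
        if score >= threshold:
            pairs.append((u, v, score))
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical coauthorship edges, invented for illustration only.
coauthorships = [
    ("alice", "carol"), ("alice", "dave"),
    ("bob", "carol"), ("bob", "dave"),
    ("carol", "erin"),
]
network = build_network(coauthorships)
candidates = similar_non_interacting(network)
```

In this toy network, `alice` and `bob` never coauthored but share exactly the same coauthors, so they come out as the top-ranked pair. A collaborator recommender in the spirit of CollabSeer could start from this kind of signal, although the real system also takes research interests into account.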

Bibliographic record

  • Author

    Chen, Hung-Hsuan

  • Author affiliation

    The Pennsylvania State University

  • Degree-granting institution: The Pennsylvania State University
  • Subject: Computer Science; Engineering Computer; Sociology Social Structure and Development
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 144 p.
  • Total pages: 144
  • Original format: PDF
  • Language of text: eng
