Mining latent entity structures from massive unstructured and interconnected data.

Abstract

The "big data" era is characterized by an explosion of information in the form of digital data collections, ranging from scientific knowledge to social media, news, and everyone's daily life. Valuable knowledge about multi-typed entities is often hidden in unstructured or loosely structured but interconnected data. Mining latent structured information around entities uncovers semantic structures from massive unstructured data and hence enables many high-impact applications, including taxonomy or knowledge base construction, multi-dimensional data analysis, and information or social network analysis.

A mining framework is proposed to solve and integrate a chain of tasks: hierarchical topic discovery, topical phrase mining, entity role analysis, and entity relation mining. It reveals two main forms of structure: topical and relational. The topical structure summarizes the topics associated with entities at various granularities, such as the research areas in computer science. The framework enables recursive construction of a phrase-represented and entity-enriched topic hierarchy from text-attached information networks. It makes a breakthrough in terms of quality and computational efficiency. The relational structure recovers hidden relationships among entities, such as advisor-advisee. A probabilistic graphical modeling approach is proposed. The method can utilize heterogeneous attributes and links to capture all kinds of semantic signals, including constraints and dependencies, to recover the hierarchical relationship with the best known accuracy.

Bibliographic record

  • Author: Wang, Chi
  • Author affiliation: University of Illinois at Urbana-Champaign
  • Degree-granting institution: University of Illinois at Urbana-Champaign
  • Subjects: Computer science; Artificial intelligence; Information science
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 168 p.
  • Total pages: 168
  • Original format: PDF
  • Language of text: eng
