Conference: SIAM International Conference on Data Mining

Block-LDA: Jointly modeling entity-annotated text and entity-entity links

Abstract

Identifying latent groups of entities from observed interactions between pairs of entities is a frequently encountered problem in areas like analysis of protein interactions and social networks. We present a model that combines aspects of mixed membership stochastic block models and topic models to improve entity-entity link modeling by jointly modeling links and text about the entities that are linked. We apply the model to two datasets: a protein-protein interaction (PPI) dataset supplemented with a corpus of abstracts of scientific publications annotated with the proteins in the PPI dataset, and an Enron email corpus. The model is evaluated by inspecting induced topics to understand the nature of the data and by quantitative methods such as functional category prediction of proteins and perplexity, which exhibit improvements when joint modeling is used over baselines that use only link or text information.
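
The abstract names perplexity as one of the quantitative measures but does not define it. As a rough illustration only, the sketch below uses the standard definition: the exponential of the average negative log-likelihood per held-out token. The function name `perplexity` and the toy input are assumptions made for this example and are not taken from the paper, which may compute the quantity over links, text, or both.

```python
import math


def perplexity(token_log_probs):
    """Per-token perplexity from natural-log probabilities of held-out tokens.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_neg_log_lik = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_neg_log_lik)


# Toy check: a model that assigns probability 1/4 to every held-out token
# is as uncertain as a uniform choice among 4 outcomes.
uniform_log_probs = [math.log(0.25)] * 10
print(perplexity(uniform_log_probs))  # close to 4
```

Lower perplexity means the model assigns higher probability to the held-out data, so an improvement over a baseline shows up as a smaller value.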
