Information-Balance-Aware Approximated Summarization of Data Provenance

Abstract

Extracting useful knowledge from data provenance information is challenging because the provenance is often too large for users to comprehend. Recent work proposes summarizing provenance by grouping semantically related provenance annotations, which yields a more concise representation. Users state their intended use of the provenance data in terms of provisioning, and summarization quality is then optimized for two goals: a smaller summary, and a smaller distance between the provisioning results derived from the summary and those derived from the original provenance. Beyond the intended provisioning use, however, more specific and diverse user requirements can be expressed and taken into account during summarization by assigning importance weights to provenance elements. We introduce the information balance index (IBI), an entropy-based measure, to dynamically evaluate how much information the summary retains and to check how well it suits user requirements. We also present an alternative provenance summarization algorithm that supports manipulation of information balance. Case studies and experiments show that, during summarization, information balance can be effectively steered towards user-defined goals, and that requirement-driven variants of the provenance summary can be produced to support a series of interesting scenarios.
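
The abstract does not give the formula for IBI, so the following is only a rough sketch of the general idea of an entropy-based, weight-aware measure over a grouped summary. It is not the authors' definition. The function names `weighted_entropy` and `retained_ratio`, the annotation labels, and the choice of Shannon entropy over per-group importance mass are all assumptions made for illustration.

```python
import math


def weighted_entropy(groups, weights):
    """Shannon entropy (in bits) of the importance mass spread across groups.

    groups  -- list of lists of annotation labels (one inner list per summary group)
    weights -- dict mapping each annotation label to a non-negative importance weight
    Both structures are hypothetical; the paper's own data model may differ.
    """
    mass = [sum(weights[a] for a in group) for group in groups]
    total = sum(mass)
    if total <= 0:
        return 0.0
    probs = [m / total for m in mass if m > 0]
    return -sum(p * math.log2(p) for p in probs)


def retained_ratio(groups, weights):
    """Entropy of the summary divided by the entropy of the ungrouped annotations.

    An assumed stand-in for "amount of information retained"; not the paper's IBI.
    """
    annotations = [a for group in groups for a in group]
    original = weighted_entropy([[a] for a in annotations], weights)
    if original == 0:
        return 1.0
    return weighted_entropy(groups, weights) / original


# Four equally important annotations merged into two groups of two.
weights = {"u1": 1.0, "u2": 1.0, "u3": 1.0, "u4": 1.0}
summary = [["u1", "u2"], ["u3", "u4"]]
print(retained_ratio(summary, weights))
```

In this toy setting, merging more annotations into fewer groups lowers the ratio, and raising an annotation's weight increases the influence of its group on the result. That is one plausible way importance weights could steer a summarizer, offered here as an assumption only.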

Bibliographic details

  • Source
    Scientific Programming | 2017, Issue 2 | 4504589.1-4504589.11 | 11 pages
  • Authors

    Pei Jisheng; Ye Xiaojun;

  • Author affiliations

    Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China;

    Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China;

  • Indexed in: Engineering Index (EI), USA;
  • Original format: PDF
  • Language of text: English
