Provenance and uncertainty.

Abstract

Data provenance, a record of the origin and transformation of data, explains how output data is derived from input data. This dissertation focuses on exploring the connection between provenance and uncertainty in two main directions: (1) how a succinct representation of provenance can help infer uncertainty in the input or the output, and (2) how introducing uncertainty can facilitate publishing provenance information while hiding associated private information.

A significant fraction of the data found in practice is imprecise, unreliable, and incomplete, and therefore uncertain. The level of uncertainty in the data must be measured and recorded in order to estimate the confidence in the results and find potential sources of error. In probabilistic databases, uncertainty in the input is recorded as a probability distribution, and the goal is to efficiently compute the induced probability distribution on the outputs. In general, this problem is computationally hard, and we seek to expand the class of inputs for which efficient evaluation is possible by exploiting provenance structure.

In some scenarios, the output data is directly examined for errors and is labeled accordingly. We need to trace back the errors in the output to the input so that the input can be refined for future processing. Because of incomplete labeling of the output and complexity of the processes generating it, the sources of error may be uncertain. We formalize the problem of source refinement, and propose models and solutions using provenance that can handle incomplete labeling. We also evaluate our solutions empirically for an application of source refinement in information extraction.

Data provenance is extensively used to help understand and debug scientific experiments that often involve proprietary and sensitive information. In this dissertation, we consider privacy of proprietary and commercial modules when they belong to a workflow and interact with other modules. We propose a model for module privacy that makes the exact functionality of the modules uncertain by selectively hiding provenance information. We also study the optimization problem of minimizing the information hidden while guaranteeing a desired level of privacy.
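
As a rough illustration of the probabilistic-database setting mentioned in the abstract, the sketch below computes the probability of one output tuple from its provenance. It is not the dissertation's method. It assumes independent input tuples and a provenance formula given in disjunctive normal form, and the names `output_probability`, `provenance`, and `probs` are hypothetical. The brute-force enumeration over possible worlds is exponential in the number of inputs, which hints at why the general problem is hard.

```python
from itertools import product


def output_probability(provenance, probs):
    """Probability that a monotone DNF provenance formula is true.

    provenance: list of clauses; each clause is a set of input tuple ids
                (AND within a clause, OR across clauses).
    probs:      dict mapping each input tuple id to its independent
                probability of being present (an assumption of this sketch).
    """
    variables = sorted(probs)
    total = 0.0
    # Enumerate every possible world (subset of present input tuples).
    for world in product([False, True], repeat=len(variables)):
        present = {v for v, bit in zip(variables, world) if bit}
        # Weight of this world under tuple independence.
        weight = 1.0
        for v in variables:
            weight *= probs[v] if v in present else 1.0 - probs[v]
        # The output tuple exists if some clause is fully present.
        if any(clause <= present for clause in provenance):
            total += weight
    return total


# Hypothetical output tuple derived as (r1 AND s1) OR (r1 AND s2).
probs = {"r1": 0.5, "s1": 0.5, "s2": 0.5}
provenance = [{"r1", "s1"}, {"r1", "s2"}]
p = output_probability(provenance, probs)
```

In this example the provenance factors as r1 AND (s1 OR s2), so under independence the same value can be obtained directly as 0.5 × (1 − 0.5 × 0.5). Structure of this kind, where each variable appears once after factoring, is one way provenance can make evaluation cheaper than enumeration.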


Bibliographic record

Author: Roy, Sudeepa
Author affiliation: University of Pennsylvania
Degree-granting institution: University of Pennsylvania
Discipline: Computer Science
Degree: Ph.D.
Year: 2012
Pages: 303
Original format: PDF
Language: English
