Statistical tools for disclosure limitation in multi-way contingency tables.

Abstract

Disseminating information from a k-way cross-classification of non-negative counts n typically corresponds to the release of various lower-order marginals or, equivalently, of the tables for subsets of the k variables. This thesis exploits the theory of graphical models to characterize classes of tables W induced by several possibly overlapping marginals. Any set of released marginals can be used to define an independence graph. In the special case when these marginals correspond to the minimal sufficient statistics of a decomposable graphical model, the theory yields explicit formulas for sharp upper and lower bounds on the cell entries of tables in W. In addition, these results on bounds are related to the Markov basis used to induce probability distributions over W. In the decomposable case, simple “data swaps” or moves are the only moves required to construct a Markov basis that links all the contingency tables in W. The approach for computing bounds and for generating Markov bases developed in the thesis generalizes to the case when the released marginals correspond to a reducible independence graph. For an arbitrary set of released marginals, explicit formulas for sharp bounds are not available, and some form of iterative algorithm is required. The method developed in this thesis is a generalization of the shuttle algorithm proposed by Buzzigoli and Giusti. This generalized shuttle algorithm can be modified to enumerate all the tables in the class W and to find a controlled rounding of a table of arbitrary dimension. The last part of this thesis studies probability distribution functions defined on spaces of tables induced by a set of marginal totals. Through examples and discussion, the thesis illustrates the practical value of the bound and distribution results for assessing the disclosure risk for categorical data.
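To make the ideas of sharp bounds and "data swap" moves concrete, the simplest instance can be sketched in Python: a two-way table whose row and column totals have been released. In that case the classical Fréchet bounds, max(0, row + column - total) and min(row, column), are sharp for each cell, and a move that adds 1 to two diagonal cells and subtracts 1 from the two opposite cells leaves both marginals unchanged. This is only an illustration of the two-way special case, not the thesis's general formulas or its generalized shuttle algorithm; the function names `frechet_bounds` and `apply_swap` and the example table are assumptions introduced here.

```python
from typing import List, Optional, Tuple

Table = List[List[int]]


def frechet_bounds(row_totals: List[int], col_totals: List[int]) -> List[List[Tuple[int, int]]]:
    """Return (lower, upper) Frechet bounds for every cell of a two-way table.

    Hypothetical helper: covers only the two-way case with both
    one-way marginals released.
    """
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must share the same grand total")
    return [
        [(max(0, r + c - n), min(r, c)) for c in col_totals]
        for r in row_totals
    ]


def apply_swap(table: Table, i1: int, i2: int, j1: int, j2: int) -> Optional[Table]:
    """Apply one swap move: +1 at (i1, j1) and (i2, j2), -1 at (i1, j2) and (i2, j1).

    Row and column totals are preserved. Returns None if the move
    would create a negative count.
    """
    new = [row[:] for row in table]
    new[i1][j1] += 1
    new[i2][j2] += 1
    new[i1][j2] -= 1
    new[i2][j1] -= 1
    if new[i1][j2] < 0 or new[i2][j1] < 0:
        return None
    return new


# Illustrative (made-up) 2x2 table and its released marginals.
table = [[3, 1], [2, 4]]
rows = [sum(r) for r in table]
cols = [sum(c) for c in zip(*table)]

bounds = frechet_bounds(rows, cols)
moved = apply_swap(table, 0, 1, 0, 1)
```

Every table reachable through such moves has the same released marginals, so each of its cells lies within the computed bounds; an intruder who sees only the marginals cannot narrow a cell further than those bounds.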