Correlation and regression in contingency tables. A measure of association or correlation in nominal data (contingency tables), using determinants

Abstract

Nominal data currently lack a correlation coefficient, such as has already been defined for real data. A measure is possible using the determinant, with the useful interpretation that the determinant gives the ratio between volumes. With M an m × n contingency table and n ≤ m, the suggested measure is r = Sqrt[det[A'A]] with A = Normalized[M]. With M an n1 × n2 × ... × nk contingency matrix, we can construct a matrix of pairwise correlations R. A matrix of such pairwise correlations is called an association matrix. If that matrix is also positive semi-definite (PSD), then it is a proper correlation matrix. The overall correlation then is R = f[R], where f can be chosen to impose PSD-ness. An option is to use f[R] = Sqrt[1 - det[R]]. However, for both nominal and cardinal data the advisable choice is to take the maximal multiple correlation within R. The resulting measure of “nominal correlation” measures the distance between a main diagonal and the off-diagonal elements, and thus is a measure of strong correlation. Cramer’s V measure for pairwise correlation can be generalized in this manner too. It measures the distance between all diagonals (including cross-diagonals and subdiagonals) and statistical independence, and thus is a measure of weaker correlation. Finally, when variances are also defined, regression coefficients can be determined from the variance-covariance matrix. The volume ratio measure can be related to the regression coefficients, not of the variables, but of the categories in the contingency matrix, using the conditional probabilities given the row and column sums.
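The formulas quoted in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not say what Normalized[M] does, so the code below assumes, purely for illustration, that each column of M is scaled to unit Euclidean length; under that assumption det[A'A] is a Gram determinant, the squared volume spanned by the unit column vectors. The function names `volume_ratio`, `cramers_v`, and `overall_correlation` are hypothetical and do not come from the paper. Cramér's V is computed with the standard chi-square definition for a two-way table, not the generalization the abstract mentions.

```python
import numpy as np


def volume_ratio(table):
    """r = sqrt(det(A'A)), with A the table after column normalization.

    ASSUMPTION: 'Normalized' is taken to mean scaling each column to unit
    Euclidean length. The abstract does not define it.
    """
    M = np.asarray(table, dtype=float)
    if M.shape[0] < M.shape[1]:
        M = M.T  # the abstract requires n <= m (no more columns than rows)
    norms = np.linalg.norm(M, axis=0)
    if np.any(norms == 0):
        raise ValueError("table has an empty column")
    A = M / norms
    det = np.linalg.det(A.T @ A)
    # Clip tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(det, 0.0)))


def cramers_v(table):
    """Standard Cramér's V for a two-way table with no empty rows or columns."""
    M = np.asarray(table, dtype=float)
    n = M.sum()
    expected = np.outer(M.sum(axis=1), M.sum(axis=0)) / n
    chi2 = ((M - expected) ** 2 / expected).sum()
    return float(np.sqrt(chi2 / (n * (min(M.shape) - 1))))


def overall_correlation(R):
    """The option f[R] = sqrt(1 - det[R]) for a matrix of pairwise correlations."""
    R = np.asarray(R, dtype=float)
    return float(np.sqrt(max(1.0 - np.linalg.det(R), 0.0)))


if __name__ == "__main__":
    diagonal = [[10, 0], [0, 20]]      # all mass on the main diagonal
    independent = [[10, 20], [20, 40]]  # rows proportional: independence
    for name, t in (("diagonal", diagonal), ("independent", independent)):
        print(name, volume_ratio(t), cramers_v(t))
    print(overall_correlation([[1.0, 0.6], [0.6, 1.0]]))
```

Under the assumed normalization, a purely diagonal table has orthogonal columns and gives r = 1, while a table with proportional columns (statistical independence) gives r = 0. For two variables with correlation ρ, det[R] = 1 − ρ², so Sqrt[1 − det[R]] reduces to |ρ|.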

Bibliographic record

  • Author

    Colignatus Thomas;

  • Year: 2007
  • Original format: PDF
  • Language: English
