Information Splitting for Big Data Analytics

Abstract

Many statistical models require the estimation of unknown (co)variance parameters. The estimates are usually obtained by maximizing a log-likelihood that involves log-determinant terms. In principle, the Newton method requires the observed information (the negative Hessian matrix, that is, the negative second derivative of the log-likelihood) to obtain an accurate maximum likelihood estimator. When one instead uses the Fisher information, the expected value of the observed information, one obtains the Fisher scoring algorithm, which is simpler than the Newton method. With the advance of high-throughput technologies in the biological sciences, recommendation systems, and social networks, the sizes of data sets, and of the corresponding statistical models, have suddenly increased by several orders of magnitude. Neither the observed information nor the Fisher information is easy to obtain for these big data sets. This paper introduces an information splitting technique to simplify the computation. After splitting the mean of the observed information and the Fisher information, a simpler approximate Hessian matrix for the log-likelihood can be obtained. This approximate Hessian matrix can significantly reduce computation and makes the linear mixed model applicable to big data sets. The splitting and the resulting simpler formulas depend heavily on matrix algebra transforms and are applicable to large-scale breeding models and genome-wide association analysis.
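
To make the splitting concrete, the following is a sketch using the standard restricted log-likelihood of a linear mixed model. The notation (y, X, V(θ), V_i, P) and the assumption that the covariance is linear in the variance parameters are introduced here for illustration and are not stated in the abstract; the paper's own formulation may differ in detail.

```latex
% Assumed model (not given in the abstract): y ~ N(X\beta, V(\theta)),
% with V(\theta) = \sum_i \theta_i V_i linear in the variance parameters.
\begin{align*}
\ell_R(\theta) &= -\tfrac12\left[\log|V| + \log|X^\top V^{-1}X| + y^\top P y\right] + \text{const},\\
P &= V^{-1} - V^{-1}X\,(X^\top V^{-1}X)^{-1}X^\top V^{-1},\\
% Observed information: negative Hessian of the log-likelihood
(I_O)_{ij} &= -\frac{\partial^2 \ell_R}{\partial\theta_i\,\partial\theta_j}
  = y^\top P V_i P V_j P y - \tfrac12\operatorname{tr}(P V_i P V_j),\\
% Fisher information: expectation of the observed information
(I_F)_{ij} &= \mathbb{E}\big[(I_O)_{ij}\big] = \tfrac12\operatorname{tr}(P V_i P V_j),\\
% Mean of the two: the trace terms cancel
(I_A)_{ij} &= \tfrac12\big[(I_O)_{ij} + (I_F)_{ij}\big]
  = \tfrac12\, y^\top P V_i P V_j P y .
\end{align*}
```

Under these assumptions the trace terms cancel in the mean, so the approximate Hessian involves only quadratic forms in y, which are cheaper to evaluate than traces of matrix products when the data set is large.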
