Identifying Linear Models in Multi-Resolution Population Data Using Minimum Description Length Principle to Predict Household Income

Amornbunchornvej Chainarong; Surasvadi Navaporn; Plangprasopchok Anon; Thajchayapong Suttipong

Abstract

One shirt size cannot fit everybody, while we cannot make a unique shirt that fits perfectly for everyone because of resource limitations. This analogy is true for policy making as well. Policy makers cannot make a single policy to solve all problems for all regions because each region has its own unique issue. At the other extreme, policy makers also cannot make a policy for each small village due to resource limitations. Would it be better if we can find a set of largest regions such that the population of each region within this set has common issues and we can make a single policy for them? In this work, we propose a framework using regression analysis and Minimum Description Length (MDL) to find a set of largest areas that have common indicators, which can be used to predict household incomes efficiently. Given a set of household features, and a multi-resolution partition that represents administrative divisions, our framework reports a set C* of largest subdivisions that have a common predictive model for population-income prediction. We formalize the problem of finding C* and propose an algorithm that can find C* correctly. We use both simulation datasets as well as a real-world dataset of Thailand's population household information to demonstrate our framework performance and application. The results show that our framework performance is better than the baseline methods. Moreover, we demonstrate that the results of our method can be used to find indicators of income prediction for many areas in Thailand. By adjusting these indicator values via policies, we expect people in these areas to gain more incomes. Hence, the policy makers will be able to make policies by using these indicators in our results as a guideline to solve low-income issues. Our framework can be used to support policy makers in making policies regarding any other dependent variable beyond income in order to combat poverty and other issues. 
We provide the R package MRReg, an implementation of our framework in the R language. The package comes with documentation for anyone interested in applying linear regression analysis to multi-resolution population data.
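The idea in the abstract, choosing the largest regions whose households share one predictive model, can be illustrated with a minimal Python sketch. It is not the authors' algorithm and does not use the MRReg API. It assumes a single household feature, a BIC-style two-part code as a stand-in for the MDL criterion, and a hypothetical tree of regions (`country`, `north`, `south`, `n1`, `n2`, `s1`, `s2`) filled with synthetic data.

```python
import math

def ols_rss(points):
    """Fit y = a + b*x by ordinary least squares; return the residual sum of squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((y - intercept - slope * x) ** 2 for x, y in points)

def code_length(points, n_params=2):
    """BIC-style two-part code length in nats (an assumed stand-in for MDL)."""
    n = len(points)
    rss = max(ols_rss(points), 1e-12)  # guard against log(0) on a perfect fit
    data_cost = 0.5 * n * math.log(rss / n)        # cost of residuals given the model
    model_cost = 0.5 * n_params * math.log(n)      # cost of the model parameters
    return data_cost + model_cost

def leaf_points(region):
    """Collect household records from every leaf under a region."""
    if "points" in region:
        return list(region["points"])
    return [p for child in region["children"] for p in leaf_points(child)]

def largest_shared_regions(region):
    """Return (names, cost) for the coarsest regions that each share one model."""
    pooled_cost = code_length(leaf_points(region))
    if "points" in region:
        return [region["name"]], pooled_cost
    names, split_cost = [], 0.0
    for child in region["children"]:
        child_names, child_cost = largest_shared_regions(child)
        names += child_names
        split_cost += child_cost
    # Keep one model for the whole region only if it is no costlier than splitting.
    if pooled_cost <= split_cost:
        return [region["name"]], pooled_cost
    return names, split_cost

def district(name, slope, intercept, sign):
    # Ten hypothetical households; noise alternates +/-0.1 so the run is deterministic.
    pts = [(x, intercept + slope * x + sign * 0.1 * (-1) ** x) for x in range(10)]
    return {"name": name, "points": pts}

# Hypothetical two-level administrative partition with synthetic data.
country = {"name": "country", "children": [
    {"name": "north", "children": [district("n1", 2, 1, +1), district("n2", 2, 1, -1)]},
    {"name": "south", "children": [district("s1", 2, 1, +1), district("s2", -3, 50, -1)]},
]}

names, _ = largest_shared_regions(country)
print(names)  # ['north', 's1', 's2']
```

In this toy data both northern districts follow the same line, so one pooled model is cheaper to describe than two. The southern districts follow different lines, so they stay separate.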

Bibliographic record

Source
ACM Transactions on Knowledge Discovery from Data | 2021, Issue 2 | 15.1-15.30 | 30 pages
Authors
Amornbunchornvej Chainarong; Surasvadi Navaporn; Plangprasopchok Anon; Thajchayapong Suttipong
Author affiliations

Thailand's National Electronics and Computer Technology Center (NECTEC), 112 Phahonyothin Rd, Khlong Luang District, Pathum Thani 12120, Thailand (all four authors)
Original format: PDF
Language: English
Keywords
Multi-resolution data; regression analysis; minimum description length; population data; model selection;
