Journal: Marketing Science

Model Selection Using Database Characteristics: Developing a Classification Tree for Longitudinal Incidence Data

Abstract

When managers and researchers encounter a data set, they typically ask two key questions: (1) Which model (from a candidate set) should I use? and (2) If I use a particular model, when is it likely to work well for my business goal? This research addresses those two questions and provides a rule, in the form of a decision tree, that lets data analysts predict the "winning model" for longitudinal incidence data before having to fit any of the candidates. We characterize data sets based on managerially relevant (and easy-to-compute) summary statistics, and we use classification techniques from machine learning to provide a decision tree that recommends when to use which model. By doing the "legwork" of obtaining this decision tree for model selection, we provide a time-saving tool to analysts. We illustrate this method for a common marketing problem (i.e., forecasting repeat purchasing incidence for a cohort of new customers) and demonstrate the method's ability to discriminate among an integrated family of models consisting of a hidden Markov model (HMM) and its constrained variants. We find that data set characteristics can strongly guide the choice of the most appropriate model, and that some model features (e.g., the "back-and-forth" migration between latent states) are more important to accommodate than others (e.g., the inclusion of an "off" state with no activity). We also demonstrate the method's broad potential by providing a general "recipe" for researchers to replicate this kind of model classification task in other managerial contexts (outside of repeat purchasing incidence data and the HMM framework).
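
The two steps described in the abstract (summarize each data set with easy-to-compute statistics, then learn a decision tree that maps those statistics to the best-performing model) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the three summary statistics, the function names (`summarize`, `fit_tree`, `recommend`), the labels `full_hmm` and `constrained_variant`, and the small Gini-based tree learner are all hypothetical choices made here, and the numbers in the demo are made-up placeholders rather than results from the paper.

```python
from collections import Counter


def summarize(incidence):
    """Summary statistics for one data set (illustrative choices only).

    `incidence` holds one list per customer with a 0/1 flag per period.
    """
    n_customers = len(incidence)
    n_periods = len(incidence[0])
    totals = [sum(row) for row in incidence]
    half = n_periods // 2
    early = sum(sum(row[:half]) for row in incidence)
    late = sum(sum(row[half:]) for row in incidence)
    return {
        # share of customers with at least one purchase
        "penetration": sum(t > 0 for t in totals) / n_customers,
        # average purchase incidence per customer-period
        "mean_rate": sum(totals) / (n_customers * n_periods),
        # share of all purchases that fall in the later periods
        "late_share": late / (early + late) if early + late else 0.0,
    }


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) that most reduces Gini, or None."""
    best, best_score = None, gini(labels)
    for feature in rows[0]:
        values = sorted({r[feature] for r in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best


def fit_tree(rows, labels, max_depth=3):
    """Grow a small classification tree; leaves are majority labels."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    feature, threshold = split
    left = [i for i, r in enumerate(rows) if r[feature] <= threshold]
    right = [i for i, r in enumerate(rows) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": fit_tree([rows[i] for i in left], [labels[i] for i in left], max_depth - 1),
        "right": fit_tree([rows[i] for i in right], [labels[i] for i in right], max_depth - 1),
    }


def recommend(tree, stats):
    """Walk the tree with one data set's statistics and return a model label."""
    while isinstance(tree, dict):
        branch = "left" if stats[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Placeholder training set: statistics of four data sets and the (hypothetical)
# model that won on each when all candidates were fitted offline.
demo_rows = [
    {"penetration": 0.5, "mean_rate": 0.3, "late_share": 0.1},
    {"penetration": 0.5, "mean_rate": 0.3, "late_share": 0.2},
    {"penetration": 0.5, "mean_rate": 0.3, "late_share": 0.7},
    {"penetration": 0.5, "mean_rate": 0.3, "late_share": 0.8},
]
demo_labels = ["constrained_variant", "constrained_variant", "full_hmm", "full_hmm"]
demo_tree = fit_tree(demo_rows, demo_labels)
print(recommend(demo_tree, summarize([[0, 0, 1, 1], [0, 1, 1, 1]])))
```

In practice the training labels would come from the expensive step the abstract calls the "legwork": fitting every candidate model on many data sets and recording which one performs best. An established library tree learner could replace the hand-rolled one here; it is written out only to keep the sketch self-contained.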
