LEGION: Visually compare modeling techniques for regression



Abstract

People construct machine learning (ML) models for use cases in varied domains such as healthcare, finance, and public policy. In doing so, they aim to improve a model’s performance by adopting various strategies, such as changing the input data (data augmentation), tuning model hyperparameters, or performing feature engineering, which includes feature extraction, feature augmentation, and feature transformation. However, how would users know which of these model construction strategies to adopt for their problem? Following any or all of these approaches allows the construction of a very large set of models, from which users may select the model(s) suited to their data analytic task. This problem of model selection is non-trivial because, in real-world use cases, many of the best-performing models (in relation to a specified metric) may appear to serve users’ goals but often exhibit nuances and tradeoffs (e.g., they may weight features differently, take different compute times to train, or predict relevant data instances differently). This paper aims to solve the problem of how to construct models and how to select a preferred modeling strategy by allowing users to compare the differences and similarities between multiple regression models, and thereby learn not only about the models but also about their data. This learning further empowers them to select model(s) that more precisely suit their analysis goals. We present LEGION, a visual analytic tool that helps users compare and select regression models constructed either by tuning their hyperparameters or by feature engineering. We also present two use cases on real-world datasets that validate the utility and effectiveness of our tool.
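The tradeoffs listed in the abstract (feature weighting, training time, per-instance predictions) can be made concrete with a small Python sketch. This is only an illustration under stated assumptions, not LEGION's implementation: the data is synthetic, the model is a two-feature ridge regression without intercept, and every name (`make_data`, `fit_ridge_2d`, `compare_models`, `disagreeing_instances`) is hypothetical. For brevity, the error is measured on the training data.

```python
import math
import random
import time


def make_data(n=200, seed=0):
    """Synthetic data (an assumption): two strongly correlated features."""
    rng = random.Random(seed)
    rows, targets = [], []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = 0.9 * x1 + 0.1 * rng.gauss(0.0, 1.0)  # strongly correlated with x1
        rows.append((x1, x2))
        targets.append(2.0 * x1 + 1.0 * x2 + rng.gauss(0.0, 0.3))
    return rows, targets


def fit_ridge_2d(rows, targets, alpha):
    """Closed-form ridge regression for two features, no intercept.

    Solves (X^T X + alpha * I) w = X^T y with Cramer's rule.
    """
    a = sum(x1 * x1 for x1, _ in rows) + alpha
    b = sum(x1 * x2 for x1, x2 in rows)
    d = sum(x2 * x2 for _, x2 in rows) + alpha
    r1 = sum(x1 * y for (x1, _), y in zip(rows, targets))
    r2 = sum(x2 * y for (_, x2), y in zip(rows, targets))
    det = a * d - b * b
    return ((r1 * d - b * r2) / det, (a * r2 - b * r1) / det)


def predict(weights, rows):
    return [weights[0] * x1 + weights[1] * x2 for x1, x2 in rows]


def rmse(predictions, targets):
    squared = sum((p - y) ** 2 for p, y in zip(predictions, targets))
    return math.sqrt(squared / len(targets))


def compare_models(rows, targets, alphas):
    """Fit one model per hyperparameter value and record the facets to compare."""
    report = {}
    for alpha in alphas:
        start = time.perf_counter()
        weights = fit_ridge_2d(rows, targets, alpha)
        fit_seconds = time.perf_counter() - start
        predictions = predict(weights, rows)
        report[alpha] = {
            "weights": weights,            # how each feature is weighted
            "rmse": rmse(predictions, targets),  # the shared metric
            "fit_seconds": fit_seconds,    # compute time to train
            "predictions": predictions,    # per-instance behaviour
        }
    return report


def disagreeing_instances(preds_a, preds_b, threshold):
    """Indices of instances whose two predictions differ by more than threshold."""
    return [i for i, (p, q) in enumerate(zip(preds_a, preds_b))
            if abs(p - q) > threshold]


if __name__ == "__main__":
    rows, targets = make_data()
    report = compare_models(rows, targets, alphas=[0.0, 50.0])
    for alpha, entry in report.items():
        print(alpha, entry["weights"], round(entry["rmse"], 4), entry["fit_seconds"])
    differing = disagreeing_instances(
        report[0.0]["predictions"], report[50.0]["predictions"], threshold=0.1)
    print(len(differing), "instances predicted differently")
```

Because the two features are strongly correlated, models fitted with different regularization strengths can reach similar error while distributing weight across the features differently, which is the kind of difference a single metric hides.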


Bibliographic details

Source
IEEE Conference on Visualization in Data Science | 2020 | pp. 12-21 | 10 pages
Authors
Subhajit Das; Alex Endert;
Original format: PDF
Keywords
Analytical models; Computational modeling; Visual analytics; Tools; Feature extraction; Data models; Tuning;


Similar literature

1. Comparing neural networks, linear and nonlinear regression techniques to model penetration resistance. [J]. Bayat H., Neyshabouri MR, Hajabbasi MA. Turkish Journal of Agriculture & Forestry. 2008, No. 5
2. Comparing Neural Networks, Linear and Nonlinear Regression Techniques to Model Penetration Resistance [J]. Hossein Bayat, Mohammad Reza Neyshaburi, Mohammad Ali Hajabbasi. Turkish Journal of Agriculture & Forestry. 2008, No. 5
3. The use of funnel plots with regression as a tool to visually compare HIV treatment outcomes between centres adjusting for patient characteristics and size: a UK Collaborative HIV Cohort study [J]. Gompels M, Michael S, Jose S. HIV Medicine. 2018, No. 6
4. Comparing Efficiency of Software Fault Prediction Models Developed Through Binary and Multinomial Logistic Regression Techniques [C]. Dipti Kumari, Kumar Rajnish. International Conference on Information Systems Design and Intelligent Applications. 2015
5. Modeling large-scale cross effect in co-purchase incidence: Comparing artificial neural network techniques and multivariate probit modeling. [D]. Yang, Zhiguo. 2015
6. Unweighted regression models perform better than weighted regression techniques for respondent-driven sampling data: results from a simulation study [O]. Lisa Avery, Nooshin Rotondi, Constance McKnight. 2019
7. Comparing pseudo-absences generation techniques in Boosted Regression Trees models for conservation purposes: A case study on amphibians in a protected area. [O]. Francesco Cerasoli, Mattia Iannella, Paola D'Alessandro. 2017
