Harvesting Data from Advanced Technologies. Final Report 2012-2013.

Abstract

Data streams are emerging everywhere: Web logs, Web page click streams, sensor data streams, and credit card transaction flows. Unlike traditional data sets, data streams are generated sequentially and arrive one item at a time rather than being available for random access before learning begins, and they are potentially so large, or even infinite, that storing the whole stream is impractical. To study learning from data streams, we target online learning, which builds a best-so-far model on the fly from newly arrived data, updates the model as needed, and applies the learned model for accurate real-time prediction or classification in real-world applications. Several challenges arise in this setting: first, data is not available for random access or even for multiple passes; second, data imbalance is common; third, the model should perform reasonably even when the amount of data is limited; fourth, the model should be easy to update but should not need frequent updates; and finally, the model should always be ready for prediction and classification. To meet these challenges, we investigate streaming feature selection that takes advantage of mutual information and group structures among candidate features. Streaming feature selection reduces the number of features by removing noisy, irrelevant, or redundant features and selecting relevant ones on the fly, which benefits applications by speeding up learning, improving accuracy, enhancing generalization, and improving model interpretability. Compared with traditional feature selection, which can handle only pre-given data sets and does not consider potential group structures among candidate features, streaming feature selection can handle streaming data and select meaningful, valuable feature sets on the fly, with or without group structures.
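To make the idea of mutual-information-based streaming feature selection concrete, the following is a minimal sketch, not the report's algorithm. It assumes discrete features that arrive one column at a time, and it uses a deliberately simple rule: keep a feature if its mutual information with the labels reaches a threshold and no already-kept feature shares at least that much information with it. The function names, the `relevance` threshold, and the redundancy rule are all hypothetical, and group-level selection is not shown.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def select_streaming_features(feature_stream, labels, relevance=0.05):
    """Process (name, column) pairs one at a time and return the kept names.

    Hypothetical rule (an assumption, not taken from the report):
    - discard a feature whose MI with the labels is below `relevance`;
    - discard it as redundant if some kept feature shares at least as much
      information with it as it shares with the labels.
    """
    selected = {}
    for name, column in feature_stream:
        rel = mutual_information(column, labels)
        if rel < relevance:
            continue  # irrelevant or noisy feature
        redundant = any(
            mutual_information(column, kept) >= rel
            for kept in selected.values()
        )
        if not redundant:
            selected[name] = column
    return list(selected)

if __name__ == "__main__":
    labels = [0, 0, 1, 1, 2, 2, 3, 3]
    stream = [
        ("f_a", [0, 0, 0, 0, 1, 1, 1, 1]),
        ("noise", [0, 1, 0, 1, 0, 1, 0, 1]),
        ("f_b", [0, 0, 1, 1, 0, 0, 1, 1]),
        ("copy_a", [1, 1, 1, 1, 0, 0, 0, 0]),
    ]
    print(select_streaming_features(stream, labels))
```

In the toy stream, `noise` carries no information about the labels and `copy_a` is a relabeled duplicate of `f_a`, so only the two complementary features survive. Each arriving feature is judged once against the features kept so far, which is what lets the selection run on the fly.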
In this research, we propose (1) a novel streaming feature selection algorithm (GFSSF, Group Feature Selection with Streaming Features) that explores mutual information and group structures among candidate features to select features at both the group and individual levels from streaming data; (2) a lazy online prediction model with data fusion, feature selection, and weighting techniques for real-time traffic prediction from heterogeneous sensor data streams; (3) a lazy online learning model (LB, Live Bayes) with dynamic resampling to learn from imbalanced embedded mobile sensor data streams for real-time activity recognition and user recognition; and (4) a lazy-update online learning model (CMLR, Cost-sensitive Multinomial Logistic Regression) with streaming feature selection for accurate real-time classification from imbalanced and small sensor data streams. Finally, by integrating traffic flow theory, advanced sensors, data gathering, data fusion, feature selection and weighting, online learning, and visualization technologies to estimate and visualize current and future traffic, we built a real-time transportation prediction system named VTraffic for the Vermont Agency of Transportation.
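The general idea behind a cost-sensitive online classifier can be sketched as follows. This is an illustration under stated assumptions, not the report's CMLR: it is a plain multinomial logistic regression trained by stochastic gradient descent one example at a time, with the gradient scaled by a per-class cost so that rare classes are not swamped. The class name, the `class_costs` and `lr` parameters, and the choice to update on every example are all hypothetical; the abstract does not describe the report's lazy-update policy, so none is reproduced here.

```python
import math

class OnlineCostSensitiveSoftmax:
    """Multinomial logistic regression updated one example at a time.

    Hypothetical sketch: the loss of each example is multiplied by the
    cost of its true class, so errors on costly (rare) classes move the
    weights further.
    """

    def __init__(self, n_features, n_classes, class_costs, lr=0.1):
        # One weight row per class; the last entry of each row is the bias.
        self.w = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        self.costs = list(class_costs)
        self.lr = lr

    def predict_proba(self, x):
        xb = list(x) + [1.0]
        scores = [sum(wj * xj for wj, xj in zip(row, xb)) for row in self.w]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        return [e / z for e in exps]

    def predict(self, x):
        p = self.predict_proba(x)
        return max(range(len(p)), key=p.__getitem__)

    def update(self, x, y):
        """One SGD step on the cost-weighted cross-entropy of (x, y)."""
        xb = list(x) + [1.0]
        p = self.predict_proba(x)
        cost = self.costs[y]
        for k, row in enumerate(self.w):
            grad = cost * (p[k] - (1.0 if k == y else 0.0))
            for j, xj in enumerate(xb):
                row[j] -= self.lr * grad * xj

if __name__ == "__main__":
    # Imbalanced toy stream: one class-1 example for every nine class-0 examples.
    model = OnlineCostSensitiveSoftmax(n_features=2, n_classes=2,
                                       class_costs=[1.0, 9.0])
    for i in range(200):
        if i % 10 == 0:
            model.update((1.0, 1.0), 1)
        else:
            model.update((0.0, 0.0), 0)
    print(model.predict((0.0, 0.0)), model.predict((1.0, 1.0)))
```

The model never stores past examples and can answer `predict` at any point in the stream, which matches the requirement that the model always be ready for prediction. The 9:1 cost ratio simply mirrors the class ratio of the toy stream; in practice the costs would have to be chosen or estimated.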

Bibliographic record

Author: Wu, X.
Year: 2014
Pages: 1-41
Total pages: 41
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords: Data Streaming; Data noise; Predictive modeling; Data sets
