Feature selection in large dataset processing, especially in the video domain.

Abstract

The rapid growth and wide application of digital video data have created a significant need for video classification. Unfortunately, two obstacles stand in the way of efficient video data management: the gap between high-level concepts and low-level features, and the high time cost of video analysis. Feature selection is introduced to address both problems, since it can improve the prediction performance of predictors, yield faster and more cost-effective predictors, and give a better understanding of the data. However, applying existing automatic feature selection algorithms to video data is impractical because of the unrealistic amount of computation time they require. So far, most feature selection techniques in video applications have relied on researchers' intuition, even though human intervention cannot keep pace with the dramatic growth of video data or the varied requirements of different users.

The first automatic feature selection algorithm we propose is the Basic Sort-Merge Tree (BSMT), which is well adapted to the characteristics of video data. The linear time cost of BSMT makes video frame categorization practical to implement. To address the problem of sparse and noisy training data in video retrieval, we propose the Complement Sort-Merge Tree (CSMT). CSMT detects complementary relationships in the results of the outer wrapper model in order to reduce the influence of more coarsely quantized prediction error. We validate this method empirically on instructional video retrieval. The Fast-converging Sort-Merge Tree (FSMT) speeds up BSMT further by building only a selected portion of the feature selection tree, using two evaluation metrics, in order to meet the stricter time requirements of on-line video retrieval. We demonstrate it on sports video shot classification. Multi-Level Feature Selection (MLFS), based on the hierarchical structure of BSMT, permits coarse-to-fine scene segmentation. The basic idea is to apportion classification cost according to how difficult each piece of data is to classify. Using instructional video scene segmentation, we show that MLFS detects video segment boundaries more efficiently than BSMT. Building on these feature selection algorithms, we propose a fast video retrieval system that combines different feature selection algorithms with lazy evaluation.

To show the generality of our feature selection algorithms, we also provide some theoretical analysis. We simulate different feature selection algorithms on common synthetic datasets and compare their performance in terms of accuracy, efficiency, and robustness. Finally, we outline future work in two directions: how to improve the feature selection algorithms, and how to apply them more effectively to different applications.
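The abstract names a sort-then-merge tree driven by a wrapper model but does not spell out the procedure. The Python sketch below is only a loose illustration of that general idea, not the dissertation's BSMT: it scores candidate feature subsets with a wrapper, sorts them, merges neighbours pairwise, and repeats until one subset remains. The function names `centroid_accuracy` and `sort_merge_select`, the nearest-centroid scorer, the pairing rule, and the toy data are all assumptions made for this example.

```python
def centroid_accuracy(X, y, feats):
    """Hypothetical wrapper score: training accuracy of a nearest-centroid
    classifier that sees only the feature indices in `feats`."""
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[f] for r in rows) / len(rows) for f in feats]

    def predict(x):
        # Squared Euclidean distance to each class centroid, restricted to feats.
        return min(classes, key=lambda c: sum(
            (x[f] - m) ** 2 for f, m in zip(feats, centroids[c])))

    return sum(predict(x) == label for x, label in zip(X, y)) / len(y)


def sort_merge_select(X, y, score=centroid_accuracy):
    """Assumed sort-merge loop: start from single features, then at each
    level score every subset, sort by score, and merge adjacent pairs.
    Returns the best-scoring subset seen at any level and its score."""
    subsets = [(f,) for f in range(len(X[0]))]
    best, best_score = None, -1.0
    while True:
        scored = sorted(((score(X, y, s), s) for s in subsets),
                        key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_score:  # strict: ties keep the smaller subset
            best_score, best = scored[0]
        if len(scored) == 1:
            return best, best_score
        ranked = [s for _, s in scored]
        # Merge neighbours in rank order; an odd leftover is carried upward.
        subsets = [ranked[i] + ranked[i + 1]
                   for i in range(0, len(ranked) - 1, 2)]
        if len(ranked) % 2:
            subsets.append(ranked[-1])


# Toy data (made up): feature 0 separates the classes, features 1-3 do not.
X = [[0, 1, 3, 5], [0, 2, 3, 6], [0, 1, 4, 6], [0, 2, 4, 5],
     [10, 1, 3, 5], [10, 2, 3, 6], [10, 1, 4, 6], [10, 2, 4, 5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

best, best_score = sort_merge_select(X, y)
print(best, best_score)  # (0,) 1.0
```

Because the number of subsets roughly halves at each level, this sketch scores on the order of 2n subsets for n features, although each score becomes more expensive as the subsets grow. Whether BSMT reaches its stated linear time cost in the same way is not something the abstract specifies.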

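Bibliographic record

Author: Liu, Yan
Author affiliation: Columbia University
Degree-granting institution: Columbia University
Discipline: Computer Science
Degree: Ph.D.
Year: 2005
Pages: 183 p.
Total pages: 183
Original format: PDF
Language of text: English (eng)
Chinese Library Classification: Automation technology, computer technology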
