Computer Speech and Language

Feature selection methods and their combinations in high-dimensional classification of speaker likability, intelligibility and personality traits



Abstract

This study focuses on feature selection in paralinguistic analysis and presents recently developed supervised and unsupervised methods for feature subset selection and feature ranking. Using the standard k-nearest-neighbors (kNN) rule as the classification algorithm, the feature selection methods are evaluated individually and in different combinations in seven paralinguistic speaker trait classification tasks. In each analyzed data set, the overall number of features greatly exceeds the number of data points available for training and evaluation, making a well-generalizing feature selection process extremely difficult. The performance of feature sets on the feature selection data is observed to be a poor indicator of their performance on unseen data. The studied feature selection methods clearly outperform a standard greedy hill-climbing selection algorithm by being more robust against overfitting. When the selection methods are suitably combined with each other, the performance in the classification task can be further improved. In general, it is shown that automatic feature selection in paralinguistic analysis can reduce the overall number of features to a fraction of the original feature set size while still achieving performance comparable to, or even better than, baseline support vector machine or random forest classifiers using the full feature set. The most typically selected features for recognition of speaker likability, intelligibility and five personality traits are also reported.
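As a rough illustration of the setting described above (far more features than samples, a kNN classifier, and filter-style feature selection), the following Python sketch uses scikit-learn with ANOVA F-score ranking as an assumed stand-in, since the abstract does not name the paper's actual selection methods. Selection is placed inside a cross-validation pipeline so that the selected subset is not evaluated on the same data it was chosen from:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a paralinguistic data set: the number of
# features (1000) far exceeds the number of samples (100), as in the study.
X, y = make_classification(n_samples=100, n_features=1000,
                           n_informative=10, random_state=0)

# Baseline: kNN on the full feature set.
knn = KNeighborsClassifier(n_neighbors=5)
acc_full = cross_val_score(knn, X, y, cv=5).mean()

# Filter-style selection (ANOVA F-score, keep 50 of 1000 features)
# wrapped in a Pipeline, so the ranking is recomputed inside each CV
# fold. Selecting features on the whole data set first and then
# cross-validating would leak label information into the subset, the
# kind of overfitting the abstract warns about.
pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=50)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])
acc_selected = cross_val_score(pipe, X, y, cv=5).mean()
```

The Pipeline wrapping is the key design choice here: it keeps the feature-selection step inside each training fold, which mirrors the abstract's observation that performance measured on the feature selection data is a poor indicator of performance on unseen data.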
