Large scale data based audio scene classification

E. Sophiya; S. Jothilakshmi


Abstract

Artificial intelligence and machine learning have been used by many research groups to process large-scale data, known as big data. Machine learning techniques that handle large, complex datasets are computationally expensive. Apache Spark, through its machine learning library Spark MLlib, is becoming a popular platform for big data analysis and is used for many machine learning problems such as classification, regression, and clustering. In this work, Apache Spark combined with a deep multilayer perceptron (MLP) architecture is proposed for audio scene classification. Log mel band features are used to represent the characteristics of the input audio scenes. The parameters of the deep neural network (DNN) are set according to the DNN baseline of the DCASE 2017 challenge. The system is evaluated on the TUT dataset (2017), and the result is compared with the provided baseline.
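The log mel band features mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' feature extractor: the function names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`, `log_mel`), the frame length, hop size, number of bands, and the particular mel-scale formula are all assumptions chosen for illustration, and the paper's actual settings are not given in this record.

```python
import numpy as np


def hz_to_mel(f):
    # One common mel-scale formula (assumed here, not taken from the paper).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters whose edges are equally spaced on the mel scale
    # between 0 Hz and the Nyquist frequency.
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def log_mel(signal, sr, n_fft=2048, hop=1024, n_mels=40, eps=1e-10):
    # Hypothetical defaults: 2048-sample frames, 50% overlap, 40 bands.
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)          # windowed frames
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # power spectrum
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel_energy + eps)                   # one row per frame


# Usage: one second of a 1 kHz tone sampled at 44.1 kHz.
sr = 44100
t = np.arange(sr) / sr
features = log_mel(np.sin(2 * np.pi * 1000.0 * t), sr)
```

Each row of `features` is a per-frame vector of log mel band energies; vectors of this kind are what a classifier such as an MLP would take as input.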


Bibliographic record

Source
International Journal of Speech Technology, 2018, Issue 4, pp. 825-836 (12 pages)
Authors
E. Sophiya; S. Jothilakshmi
Author affiliations

Department of Information Technology, Annamalai University

Department of Computer Science and Engineering, Annamalai University

Original format: PDF
Text language: English
Keywords
Big data analytics; Machine learning; Apache Spark MLlib; Audio processing; Audio scene analysis; Audio scene classification; Deep learning; Audio features


