On the Identification of FOSD-based Non-zero Onset Speech Dataset

Abstract

Recent trends in voicebot and chatbot application development have enabled the use of speech-to-text (STT) and text-to-speech (TTS) generation techniques. To develop such TTS or STT engines, the text and the corresponding recorded speech in each audio file used for training, validation and testing must be aligned, so that the resulting engines achieve the desired conversion quality. Aligning speech and text calls for an audio alignment tool. Such tools often use onset detection algorithms to label the speech start and end times of an audio file, and this information is then stored together with the file's transcript. This work provides an open non-zero onset Vietnamese speech dataset. The dataset contains 348 audio files filtered from over 25,000 Vietnamese speech records (approximately 30 hours) released publicly by FPT Corporation, Vietnam, in 2018. This amount of labeled data is considered more than sufficient for typical onset detection algorithm research.
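The abstract does not say which onset detection algorithm the authors used, so the following is only a minimal sketch of the general idea: a simple energy-threshold detector that labels the speech start and end times of a signal and stores them with the transcript. All function names, the 10 ms frame length, and the 0.1 threshold ratio are assumptions chosen for illustration, not values from the paper.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech_bounds(samples, sample_rate, frame_ms=10, threshold_ratio=0.1):
    """Return (start_s, end_s) of the active region, or None if the signal is silent.

    A frame counts as active when its energy reaches threshold_ratio times
    the loudest frame's energy (an illustrative rule, not the paper's).
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    energies = frame_energies(samples, frame_len)
    if not energies or max(energies) == 0:
        return None
    threshold = threshold_ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e >= threshold]
    start_s = active[0] * frame_len / sample_rate
    end_s = (active[-1] + 1) * frame_len / sample_rate
    return start_s, end_s


def label_record(file_name, transcript, samples, sample_rate):
    """Store the detected start and end times together with the transcript."""
    bounds = detect_speech_bounds(samples, sample_rate)
    if bounds is None:
        return None
    start_s, end_s = bounds
    return {
        "file": file_name,
        "transcript": transcript,
        "start_s": start_s,
        "end_s": end_s,
        "nonzero_onset": start_s > 0,  # speech does not begin at t = 0
    }


def synthetic_signal(sample_rate=8000):
    """0.5 s of silence, 1.0 s of a 200 Hz tone, then 0.5 s of silence."""
    silence = [0.0] * (sample_rate // 2)
    tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(sample_rate)]
    return silence + tone + silence


record = label_record("example.wav", "hypothetical transcript", synthetic_signal(), 8000)
print(record)
```

A record whose detected start time is greater than zero is a non-zero onset file in the sense used here. Real speech would likely need overlapping frames, noise-floor estimation, and smoothing before a threshold rule like this becomes reliable.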

Bibliographic details

Source
IEEE Student Conference on Research and Development | 2020 | pp. 108-110 | 3 pages
Authors
Duc Chung Tran; Rosdiazli Ibrahim
Original format: PDF
Keywords
non-zero; onset; detection; data; identification; retrieval; text-to-speech; Vietnamese
