
myDJ: Recommending Karaoke Songs From One's Own Voice


Abstract

Singing is a popular activity worldwide, across all walks of life. Many of us have suffered a karaoke defeat: singing a beloved song only to make everyone else plug their ears. A major reason for such failures is that the songs one loves may not fit one's laryngeal anatomy, which determines the capacity of one's voice source. myDJ is a karaoke recommendation system that recommends suitable songs based on the pitch and intensity that a singer can comfortably produce. Unlike most state-of-the-art recommendation systems, which rely on similarity defined over song content, users' listening habits, or listening patterns, myDJ is the first prototype that recommends songs according to one's physical phonation area. The key challenge in this approach is to connect the singer's phonation model, typically described as a Vocal Range Profile (VRP) [1] in clinical quantitative voice assessment, to the music database, so that a given profile retrieves suitable songs. Unfortunately, a VRP alone is not enough to retrieve songs, because it only describes the minimum and maximum sound pressure levels (dB) across the singer's vocal range (Hz) and does not evaluate voice quality. Our work focuses on techniques to build this connection for retrieval.

myDJ consists of four modules: (1) The singer profiler, which creates a profile for each singer via a traditional musical test known as Messa di voce. The singer profile consists of two parts: the VRP, depicted as a 2D region, and the Overall Voice Quality (OVQ) [3], which combines multiple voice quality features as a distribution over the 2D region defined by the VRP. (2) The MIDI song database. We treat each song in the database as a document, and the notes in a song that share the same pitch or the same intensity as a pitch term (PT) or an intensity term (IT), respectively. The duration of each IT and the TF-IDF weighted duration of each PT, both accumulated from the note durations in a song, are the major factors affecting the fitness of a song. Thus, we define a song's profile as the combination of the TF-IDF weights, the PT and IT durations, and the pitch and intensity values of all notes in it. (3) The learning-to-rank module, where the ListNet algorithm [2] is applied. (4) The song recommendation module, where songs are recommended using the ranking function learned by ListNet.

myDJ works in two phases. 1. The offline training phase: the learning algorithm uses as training data (a) the singer profiles and (b) the song profiles, each manually labeled by the singers themselves with a five-level fitness score. During training, each singer profile is treated as a query and the song profiles as the documents. Features are extracted from all (query, document) pairs, and gradient descent is applied to calculate the parameters of the ranking function. 2. The online phase: first, the test subject takes the musical test to acquire a singer profile. Next, feature extraction is conducted for all (query, document) pairs. To expedite this process, we use a document index to prune documents (songs) before any feature extraction. Finally, the rank score of each song is calculated using the ranking function learned in the offline phase.

Figure 1 shows a screenshot of myDJ. The left part of the interface shows a singer profile, where the 2D region indicates the VRP and the color of each pixel indicates the OVQ value (red for good and blue for poor). The top right of the screen shows a ranked list of songs recommended for this subject.
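The two phases can be illustrated with a minimal sketch of the top-one ListNet objective trained by gradient descent. It is a sketch under stated assumptions and not the authors' implementation: the feature matrix `X` stands in for whatever features are extracted from each (singer profile, song profile) pair, the scoring function is assumed to be linear, and the function names `listnet_loss_and_grad`, `train_listnet`, and `recommend` are hypothetical.

```python
import numpy as np

def softmax(v):
    # Numerically stable softmax: turns scores into top-one probabilities.
    e = np.exp(v - np.max(v))
    return e / e.sum()

def listnet_loss_and_grad(w, X, y):
    """Top-one ListNet loss and gradient for a single query (one singer).

    X: (n_songs, n_features) features of each (query, document) pair.
    y: (n_songs,) fitness labels, e.g. the five-level scores.
    Assumption: linear scoring function f(x) = w . x.
    """
    p_true = softmax(np.asarray(y, dtype=float))
    p_pred = softmax(X @ w)
    # Cross-entropy between the label and the predicted distributions;
    # the small constant only guards against log(0).
    loss = -np.sum(p_true * np.log(p_pred + 1e-12))
    grad = X.T @ (p_pred - p_true)
    return loss, grad

def train_listnet(queries, n_features, lr=0.1, epochs=200):
    """Offline phase: plain gradient descent over all (X, y) queries."""
    w = np.zeros(n_features)
    for _ in range(epochs):
        for X, y in queries:
            _, grad = listnet_loss_and_grad(w, X, y)
            w -= lr * grad
    return w

def recommend(w, X, top_k=3):
    """Online phase: score candidate songs, return indices best-first."""
    scores = X @ w
    return np.argsort(-scores)[:top_k]
```

In the online phase, `X` would be built only for the songs that survive index-based pruning, so `recommend` scores a reduced candidate set. The learning rate and epoch count are illustrative defaults, not values reported in the abstract.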
