Dictation of Japanese Speech Based on Kana and Kanji Character String

Akinori Ito; Hiroaki Kinno; Masaharu Katoh; Tetsuo Kosaka; Masaki Kohda

Abstract

In this paper, a character-based Japanese dictation method is proposed. The method is based on the kana and kanji string language model proposed by Ito et al. First, sentences in the training corpus are split into character-based units (CBUs). Then strings of CBUs (CBUSes) are chosen from the CBU corpus according to a statistical criterion. We examined three criteria for CBUS selection: frequency-based selection, mutual-information-based selection, and their combination. The experimental results showed that the combined method gave the best result (CBU error rates of 7.19% and 8.75% for the 20k- and 60k-word vocabulary conditions, respectively), which was better than the ordinary word-based method (CBU error rates of 7.61% and 9.15% for the same conditions, respectively).

In addition, we carried out a recognition experiment on the Corpus of Spontaneous Japanese to confirm that the proposed method is effective not only for read speech but also for spontaneous speech. The best result (29.82%) was obtained with the frequency-based method, which is better than the word-based recognition result (32.80%).
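
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the unit inventory, the thresholds, or how the two criteria are combined. The sketch assumes that one character is one CBU, that candidate CBUSes are adjacent CBU pairs, that "mutual information" means pointwise mutual information over pair counts, and that the combined criterion requires a pair to pass both thresholds. All function names and parameters (`split_into_cbus`, `select_pairs`, `min_count`, `min_pmi`, `merge_pairs`) are hypothetical.

```python
import math
from collections import Counter


def split_into_cbus(sentence):
    """Split a sentence into units; ASSUMPTION: one character is one CBU."""
    return [ch for ch in sentence if not ch.isspace()]


def select_pairs(corpus, min_count=None, min_pmi=None):
    """Select adjacent CBU pairs as candidate strings (CBUSes).

    min_count only -> frequency-based selection
    min_pmi only   -> mutual-information-based selection
    both           -> combined selection (ASSUMPTION: both must hold)
    """
    unigrams, bigrams = Counter(), Counter()
    for units in corpus:
        unigrams.update(units)
        bigrams.update(zip(units, units[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    selected = set()
    for (x, y), count in bigrams.items():
        if min_count is not None and count < min_count:
            continue
        if min_pmi is not None:
            # Pointwise mutual information of the pair, in bits.
            p_xy = count / n_bi
            p_x = unigrams[x] / n_uni
            p_y = unigrams[y] / n_uni
            if math.log2(p_xy / (p_x * p_y)) < min_pmi:
                continue
        selected.add((x, y))
    return selected


def merge_pairs(units, selected):
    """Greedily join selected pairs, scanning left to right."""
    merged, i = [], 0
    while i < len(units):
        if i + 1 < len(units) and (units[i], units[i + 1]) in selected:
            merged.append(units[i] + units[i + 1])
            i += 2
        else:
            merged.append(units[i])
            i += 1
    return merged


# Toy corpus, invented for illustration only.
corpus = [split_into_cbus(s) for s in ["東京に行く", "東京で働く", "京都に行く"]]
frequent = select_pairs(corpus, min_count=2)
print(merge_pairs(corpus[0], frequent))
```

In a full system, the merged units would serve as the vocabulary of an n-gram language model, and longer strings could be built by repeating the selection on the merged corpus. Both of those steps are outside this sketch.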

Bibliographic information

Source: International Journal of Computer Processing of Languages, 2009, No. 1, pp. 75-98 (24 pages)
Authors: Akinori Ito; Hiroaki Kinno; Masaharu Katoh; Tetsuo Kosaka; Masaki Kohda
Affiliations:

Graduate School of Engineering, Tohoku University, Sendai, 980-8579 Japan;

Hitachi Software Engineering Co., Ltd., Tokyo, Japan;

Graduate School of Science and Engineering, Yamagata University, Yonezawa, 992-8510 Japan;

Graduate School of Science and Engineering, Yamagata University, Yonezawa, 992-8510 Japan;

Graduate School of Science and Engineering, Yamagata University, Yonezawa, 992-8510 Japan;

Original format: PDF
Language: English
Keywords: kana character; kanji character; language modeling; speech recognition

