Hierarchical Character Embeddings: Learning Phonological and Semantic Representations in Languages of Logographic Origin Using Recursive Neural Networks

Minh Nguyen; Gia H. Ngo; Nancy F. Chen


Abstract

Logographs (Chinese characters) have recursive structures (i.e., hierarchies of sub-units within each logograph) that carry phonological and semantic information, and the developmental psychology literature suggests that native speakers leverage these structures when learning to read. Exploiting these structures could lead to better embeddings that benefit many downstream tasks. We propose building hierarchical logograph (character) embeddings from logographs' recursive structures using treeLSTM, a recursive neural network. Using a recursive neural network imposes a prior on the mapping from logographs to embeddings, since the network must read the sub-units of a logograph in the order specified by its recursive structure. Based on human behavior in language learning and reading, we hypothesize that modeling logographs' structures with a recursive neural network should be beneficial. To verify this claim, we consider two tasks: (1) predicting logographs' Cantonese pronunciation from their logographic structures and (2) language modeling. Empirical results show that the proposed hierarchical embeddings outperform baseline approaches. Diagnostic analysis suggests that hierarchical embeddings constructed using treeLSTM are less sensitive to distractors and thus more robust, especially on complex logographs.
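The composition described in the abstract can be illustrated with a minimal sketch of a binary Tree-LSTM cell that folds a character's sub-unit tree into a single vector. This is not the authors' implementation: the weights are random and untrained, biases are omitted, and the class name `BinaryTreeLSTM`, the helper `leaf`, the ASCII labels, and the embedding size are all assumptions made for illustration. The example tree mirrors one common decomposition of 想 (相 above 心, where 相 is 木 beside 目); it is not taken from the paper's data.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


class BinaryTreeLSTM:
    """Binary Tree-LSTM cell with random, untrained weights (biases omitted)."""

    GATES = ("input", "forget_left", "forget_right", "output", "update")

    def __init__(self, dim, labels, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # One input vector per node label (sub-unit or layout operator).
        self.embed = {
            label: [rng.uniform(-0.5, 0.5) for _ in range(dim)]
            for label in labels
        }
        # Each gate reads the concatenation [x, h_left, h_right].
        self.weights = {
            gate: [
                [rng.uniform(-0.5, 0.5) for _ in range(3 * dim)]
                for _ in range(dim)
            ]
            for gate in self.GATES
        }

    def encode(self, node):
        """Return (h, c) for node = (label, left_child, right_child)."""
        label, left, right = node
        zeros = [0.0] * self.dim
        # Leaves have no children, so their child states are zero vectors.
        h_l, c_l = self.encode(left) if left is not None else (zeros, zeros)
        h_r, c_r = self.encode(right) if right is not None else (zeros, zeros)
        features = self.embed[label] + h_l + h_r
        pre = {g: matvec(self.weights[g], features) for g in self.GATES}
        i = [sigmoid(v) for v in pre["input"]]
        f_l = [sigmoid(v) for v in pre["forget_left"]]
        f_r = [sigmoid(v) for v in pre["forget_right"]]
        o = [sigmoid(v) for v in pre["output"]]
        u = [math.tanh(v) for v in pre["update"]]
        # Each child's memory cell passes through its own forget gate.
        c = [
            i[k] * u[k] + f_l[k] * c_l[k] + f_r[k] * c_r[k]
            for k in range(self.dim)
        ]
        h = [o[k] * math.tanh(c[k]) for k in range(self.dim)]
        return h, c


def leaf(label):
    """Build a leaf node (hypothetical helper)."""
    return (label, None, None)


# Illustrative tree: (wood beside eye) placed above heart.
upper = ("left_right", leaf("wood"), leaf("eye"))
character = ("top_bottom", upper, leaf("heart"))

labels = ["left_right", "top_bottom", "wood", "eye", "heart"]
model = BinaryTreeLSTM(dim=4, labels=labels)
embedding, _ = model.encode(character)

# Swapping the children changes the result: the cell is order-sensitive.
swapped, _ = model.encode(("top_bottom", leaf("heart"), upper))
print(len(embedding), embedding != swapped)
```

Because the left and right children feed separate forget gates and separate weight columns, the same sub-units arranged differently yield different embeddings, which is one way to read the abstract's point that the recursive structure acts as a prior on the mapping.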


Bibliographic record

Source
IEEE/ACM Transactions on Audio, Speech, and Language Processing | 2020, no. 2020 | pp. 461-473 | 13 pages
Authors
Minh Nguyen; Gia H. Ngo; Nancy F. Chen
Author affiliations
Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore (listed once for each of the three authors)
Indexing information
Original format: PDF
Text language: English
Keywords
Task analysis; Training; Semantics; Neural networks; Predictive models; Binary trees; Data models
