Exploring word embeddings and phonological similarity for the unsupervised correction of language learner errors

Abstract

The presence of misspellings and other errors or non-standard word forms poses a considerable challenge for NLP systems. Although several supervised approaches have been proposed previously to normalize these forms, annotated training data is scarce for many languages. We investigate, therefore, an unsupervised method where correction candidates for Swedish language learners' errors are retrieved from word embeddings. Furthermore, we compare the usefulness of combining cosine similarity with orthographic and phonological similarity based on a neural grapheme-to-phoneme conversion system we train for this purpose. Although combinations of similarity measures have been explored for finding correction candidates, it remains unclear how these measures relate to each other and how much they contribute individually to identifying the correct alternative. We experiment with different combinations of these and find that integrating phonological information is especially useful when the majority of learner errors are related to misspellings, but less so when errors are of a variety of types including, e.g., grammatical errors.
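
The candidate ranking described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the function names, the equal default weights, the use of normalized Levenshtein distance for both orthographic and phonological similarity, the toy embedding vectors, and the hand-written phoneme sequences (which stand in for the output of a trained grapheme-to-phoneme model and are not accurate transcriptions).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def levenshtein(a, b):
    """Edit distance between two sequences (strings or tuples of symbols)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (x != y)))   # substitution
        prev = cur
    return prev[-1]


def edit_similarity(a, b):
    """Edit distance rescaled to [0, 1], where 1 means identical."""
    longest = max(len(a), len(b))
    return 1.0 - levenshtein(a, b) / longest if longest else 1.0


def rank_candidates(error, vocabulary, embeddings, phonemes,
                    weights=(1 / 3, 1 / 3, 1 / 3)):
    """Rank candidates by a weighted sum of the three similarity measures.

    The weights are a hypothetical choice; setting one to 0 drops that measure.
    """
    w_cos, w_orth, w_phon = weights
    scored = []
    for cand in vocabulary:
        score = (w_cos * cosine(embeddings[error], embeddings[cand])
                 + w_orth * edit_similarity(error, cand)
                 + w_phon * edit_similarity(phonemes[error], phonemes[cand]))
        scored.append((score, cand))
    return [cand for _, cand in sorted(scored, reverse=True)]


# Toy data, invented for illustration only.
embeddings = {
    "sjuckhus": [0.9, 0.1, 0.2],   # misspelled learner form
    "sjukhus": [0.8, 0.2, 0.1],
    "läkare": [0.7, 0.3, 0.2],
    "skola": [0.1, 0.9, 0.3],
}
phonemes = {                        # stand-in for grapheme-to-phoneme output
    "sjuckhus": ("ɧ", "ʉ", "k", "h", "ʉː", "s"),
    "sjukhus": ("ɧ", "ʉː", "k", "h", "ʉː", "s"),
    "läkare": ("l", "ɛː", "k", "a", "r", "e"),
    "skola": ("s", "k", "uː", "l", "a"),
}

ranked = rank_candidates("sjuckhus", ["sjukhus", "läkare", "skola"],
                         embeddings, phonemes)
print(ranked[0])  # sjukhus
```

Passing a different `weights` tuple, for example `(1, 0, 0)` for cosine similarity alone, is one way to compare how much each measure contributes to ranking the correct alternative first.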

Bibliographic details

Source
Joint SIGHUM workshop on computational linguistics for cultural heritage, social sciences, humanities and literature | 2018 | xi 168 p. | 10 pages
Authors
Ildiko Pilan; Elena Volodina
Original format: PDF
Chinese Library Classification: Programming, software engineering

Similar literature

1. Exploiting Syntactic Similarities for Preposition Error Corrections on Indonesian Sentences Written by Second Language Learner [J]. Budi Irmawati, Hiroyuki Shindo, Yuji Matsumoto. Procedia Computer Science. 2016, No. 1
2. English Language Learners' Nonword Repetition Performance: The Influence of Age, L2 Vocabulary Size, Length of L2 Exposure, and L1 Phonology [J]. Duncan Tamara Sorenson, Paradis Johanne. Journal of speech, language, and hearing research: JSLHR. 2016, No. 1
3. Differences in phonologic and prosodic abilities in children with phonological language impairment and phonological-grammatical language impairment assessed with non-word repetition [J]. From Asa, Sundstrom Simon, Samuelsson Christina. Logopedics, phoniatrics, vocology. 2016, No. 1a2
4. Exploring word embeddings and phonological similarity for the unsupervised correction of language learner errors [C]. Ildiko Pilan, Elena Volodina. Second joint SIGHUM workshop on computational linguistics for cultural heritage, social sciences, humanities and literature. 2018
5. The influence of phonological similarity in adults learning words in a second language [D]. Stamer, Melissa. 2010
6. The Beginning Spanish Lexicon: A Web-based interface to calculate phonological similarity among Spanish words in adults learning Spanish as a foreign language [O]. Michael S. Vitevitch, Melissa K. Stamer, Douglas Kieweg
7. The Study of English Phonological Errors of Advanced Second Language Learners in Pronouncing Similarly-Spelled Words [O]. Dangin Dangin, Nurvita Wijayanti. 2018
