Journal: ACM Transactions on Asian Language Information Processing

Visually and Phonologically Similar Characters in Incorrect Chinese Words: Analyses, Identification, and Applications


Abstract

Information about students' mistakes opens a window onto their learning processes and helps us design effective course work so that students avoid repeating the same errors. Learning from mistakes is important not just in human learning activities; it is also a crucial ingredient in techniques for developing student models. In this article, we report findings of our study on 4,100 erroneous Chinese words. Seventy-six percent of these errors were related to the phonological similarity between the correct and the incorrect characters, 46% were due to visual similarity, and 29% involved both factors. We propose an algorithm that aims to replicate incorrect Chinese words. The algorithm extends the principles of decomposing Chinese characters with the Cangjie codes to judge the visual similarity between Chinese characters. The algorithm also employs empirical rules to determine the degree of similarity between Chinese phonemes. To show its effectiveness, we ran the algorithm to select and rank a list of about 100 candidate characters, from more than 5,100 characters, for the incorrectly written character in each of the 4,100 errors. We inspected whether the incorrect character was indeed included in the candidate list and analyzed whether it was ranked near the top of that list. Experimental results show that our algorithm captured 97% of the incorrect characters for the 4,100 errors when the average length of the candidate lists was 104. Further analyses showed that the incorrect characters ranked among the top 10 candidates in 89% of the phonologically similar errors and in 80% of the visually similar errors.
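The ranking and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give their scoring rules, so the sketch substitutes a generic string similarity (Python's `difflib.SequenceMatcher`) over Cangjie code strings, and it leaves out phonological similarity entirely. The `CANGJIE` table is a toy sample that should be checked against a real Cangjie table, and the function names `visual_similarity`, `rank_candidates`, and `capture_rate` are hypothetical. The sketch also assumes that candidates are generated from the correct character and then checked for the observed incorrect one.

```python
from difflib import SequenceMatcher

# Toy Cangjie code table (assumption: sample entries for illustration only;
# verify against a real Cangjie table before relying on them).
CANGJIE = {
    "日": "A",
    "月": "B",
    "明": "AB",
    "木": "D",
    "林": "DD",
    "森": "DDD",
}


def visual_similarity(a, b):
    """Return a score in [0, 1] comparing the Cangjie codes of two characters.

    SequenceMatcher is a stand-in for the paper's similarity rules,
    which the abstract does not spell out.
    """
    return SequenceMatcher(None, CANGJIE[a], CANGJIE[b]).ratio()


def rank_candidates(target, limit=100):
    """Rank all other known characters by similarity to `target`, best first."""
    others = [c for c in CANGJIE if c != target]
    others.sort(key=lambda c: visual_similarity(target, c), reverse=True)
    return others[:limit]


def capture_rate(errors, limit=100):
    """Fraction of (correct, incorrect) pairs whose incorrect character
    appears among the top `limit` candidates generated for the correct one."""
    hits = sum(
        incorrect in rank_candidates(correct, limit)
        for correct, incorrect in errors
    )
    return hits / len(errors)


# Hypothetical error records: (correct character, character written instead).
errors = [("林", "森"), ("明", "日"), ("木", "月")]
full_list_rate = capture_rate(errors)          # inclusion in the whole list
top2_rate = capture_rate(errors, limit=2)      # inclusion in the top 2 only
```

Comparing `capture_rate` at different values of `limit` mirrors the two questions the abstract asks: whether the incorrect character is in the candidate list at all, and whether it ranks near the top.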

Bibliographic details

  • Source
  • Author affiliations

    Department of Computer Science, College of Science, National Chengchi University, Taipei, Taiwan;

    Department of Computer Science, College of Science, National Chengchi University, Taipei, Taiwan;

    Department of Computer Science, College of Science, National Chengchi University, Taipei, Taiwan;

    Department of Computer Science, College of Science, National Chengchi University, Taipei, Taiwan;

    Department of Computer Science and Information Engineering, College of Informatics, Chaoyang University of Technology, Taichung, Taiwan;

    Institute of Linguistics, Academia Sinica, Taipei, Taiwan;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    design; languages;

