
METHOD FOR AUTOMATICALLY EXTRACTING KOREAN FOREIGN WORD


Abstract

PROBLEM TO BE SOLVED: To automatically detect (extract) foreign words in a Korean corpus, without requiring morphological analysis of a translation corpus, by exploiting the phonological similarity between Japanese and Korean foreign words and using Japanese katakana words as a clue.

SOLUTION: A Korean corpus is romanized, and the romanized text is divided into phrases according to the Korean word-spacing rule. Postpositional particles and affixes attached to the tail of each phrase are removed to obtain words. Foreign word candidates are obtained by removing any phrase that includes letters not used in foreign word notation and any word registered in an existing Korean dictionary. From these candidates, the words with high similarity to romanized Japanese katakana words are extracted as Korean foreign words.

COPYRIGHT: (C)2005, JPO&NCIPI
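The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`romanize`, `strip_particle`, `extract_foreign_words`), the particle list `PARTICLES`, the letter set `UNUSED`, and the 0.6 threshold are all hypothetical. The abstract does not name a similarity measure, so Python's `difflib.SequenceMatcher` is swapped in as a stand-in. Romanization is done syllable by syllable and ignores sound-change rules, and the katakana words are assumed to be romanized already.

```python
from difflib import SequenceMatcher

# Letter tables in Unicode jamo order (Revised-Romanization-style, simplified).
INITIALS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s", "ss", "",
            "j", "jj", "ch", "k", "t", "p", "h"]
MEDIALS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
           "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"]
FINALS = ["", "k", "k", "k", "n", "n", "n", "t", "l", "k", "m", "l", "l", "l",
          "p", "l", "m", "p", "p", "t", "t", "ng", "t", "t", "k", "t", "p", "t"]

# Hypothetical, incomplete list of romanized particles/affixes (longest first).
PARTICLES = sorted(["eseo", "euro", "reul", "neun", "eul", "eun",
                    "ga", "ro", "ui", "e", "i"], key=len, reverse=True)

# Placeholder for letters assumed not to appear in foreign word notation.
# A substring check on romanized text is crude; it is only for illustration.
UNUSED = ("kk", "tt", "pp", "ss", "jj")


def romanize(text):
    """Romanize precomposed Hangul syllables; pass other characters through."""
    out = []
    for ch in text:
        idx = ord(ch) - 0xAC00
        if 0 <= idx < 11172:  # Hangul syllable block U+AC00..U+D7A3
            out.append(INITIALS[idx // 588] + MEDIALS[(idx % 588) // 28]
                       + FINALS[idx % 28])
        else:
            out.append(ch)
    return "".join(out)


def strip_particle(phrase):
    """Remove one trailing particle, keeping at least one letter of the word."""
    for p in PARTICLES:
        if phrase.endswith(p) and len(phrase) > len(p):
            return phrase[:-len(p)]
    return phrase


def extract_foreign_words(corpus, katakana_romaji, dictionary, threshold=0.6):
    """Return candidate words that resemble any romanized katakana word."""
    found = []
    for phrase in romanize(corpus).split():  # phrases are space-delimited
        if any(seq in phrase for seq in UNUSED):
            continue  # phrase contains letters not used for foreign words
        word = strip_particle(phrase)
        if word in dictionary:
            continue  # already a registered Korean word
        best = max((SequenceMatcher(None, word, k).ratio()
                    for k in katakana_romaji), default=0.0)
        if best >= threshold:
            found.append(word)
    return found


corpus = "학교에서 컴퓨터를 사용한다 떡을"
print(extract_foreign_words(corpus, ["konpyuta"], {"hakgyo"}))
```

Here the romanized dictionary `{"hakgyo"}` and the katakana list `["konpyuta"]` are toy inputs. A real system would need a fuller particle inventory and a similarity measure tuned to Japanese and Korean phonology.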
