Conference: International conference on computer science and it applications

Mining Semantic Tags in a Content Analysis System for a Letter Database of Ethnic Koreans Living in China

Abstract

In this paper, we present a content analysis system for a database of letters from ethnic Koreans living in China, built using several machine learning techniques. First, letters sent by ethnic Koreans living in China to the Korea Broadcasting System (KBS) in search of separated family members were digitized and assembled into a letter database. Second, to support retrieval of the digitized letters, each letter was manually annotated with semantic tags. Third, to analyze the contents of the digitized letters, machine learning techniques were applied to those semantic tags: tag-based topic modeling was performed and a tag network was constructed. Over 150,000 letters were scanned as image files and stored in the letter database; of these, 3,500 letters were randomly selected and analyzed. Our results indicate that the semantic tags play an important role in retrieving digitized letters. We are currently enhancing the tag analysis algorithms, and we expect the developed system to broaden the horizons of digital humanities.
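The abstract does not say how the tag network is constructed. One common approach, assumed here purely for illustration, is to link two tags whenever they are annotated on the same letter and to weight the link by how many letters share the pair. The function name `build_tag_network` and the sample tags below are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def build_tag_network(tagged_letters):
    """Count how often each pair of tags appears on the same letter.

    tagged_letters: iterable of tag lists, one list per letter.
    Returns a Counter mapping (tag_a, tag_b) -> number of shared letters,
    with each pair stored in sorted order.
    """
    edges = Counter()
    for tags in tagged_letters:
        # De-duplicate and sort so (a, b) and (b, a) become the same edge.
        for tag_a, tag_b in combinations(sorted(set(tags)), 2):
            edges[(tag_a, tag_b)] += 1
    return edges


# Hypothetical manual annotations for three letters (not real data).
letters = [
    ["brother", "seoul", "separated_1945"],
    ["brother", "seoul"],
    ["sister", "busan", "separated_1945"],
]

network = build_tag_network(letters)
for (tag_a, tag_b), weight in network.most_common(3):
    print(tag_a, tag_b, weight)
```

Edge weights from such a co-occurrence count could then be passed to a graph library for visualization or clustering. Whether the authors weight edges this way is not stated in the abstract.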
