Source: Information Retrieval Technology (conference proceedings)

A Refinement Framework for Cross Language Text Categorization

Abstract

Cross language text categorization is the task of exploiting labelled documents in a source language (e.g. English) to classify documents in a target language (e.g. Chinese). In this paper, we investigate the use of a bilingual lexicon for cross language text categorization and propose a novel two-stage refinement framework. In the first stage, a cross language model transfer generates initial labels for the documents in the target language. In the second stage, an expectation maximization (EM) algorithm based on the naive Bayes model refines these initial labels into the final labels. Preliminary experimental results on collected corpora show that the proposed framework is effective.
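
The abstract does not spell out either stage, so the following is only a minimal sketch of how such a two-stage pipeline could look, not the authors' implementation. It assumes a multinomial naive Bayes model, a lexicon given as a plain dictionary from source words to lists of target words, and a transfer rule that splits each source word's class probability evenly across its translations. All function names, parameters, and the toy data are hypothetical.

```python
import math

def m_step(docs, weights, classes, vocab, alpha=1.0):
    """Estimate naive Bayes parameters from (possibly soft) label weights."""
    class_mass = {c: alpha for c in classes}
    word_mass = {c: {w: alpha for w in vocab} for c in classes}
    for doc, wts in zip(docs, weights):
        for c in classes:
            class_mass[c] += wts[c]
            for w in doc:
                if w in word_mass[c]:
                    word_mass[c][w] += wts[c]
    total = sum(class_mass.values())
    log_prior = {c: math.log(class_mass[c] / total) for c in classes}
    log_lik = {}
    for c in classes:
        z = sum(word_mass[c].values())
        log_lik[c] = {w: math.log(m / z) for w, m in word_mass[c].items()}
    return log_prior, log_lik

def e_step(docs, log_prior, log_lik, classes):
    """Compute P(class | doc) for every document; unknown words are ignored."""
    posteriors = []
    for doc in docs:
        scores = {c: log_prior[c] + sum(log_lik[c].get(w, 0.0) for w in doc)
                  for c in classes}
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        z = sum(exp.values())
        posteriors.append({c: v / z for c, v in exp.items()})
    return posteriors

def transfer_model(log_lik, lexicon, target_vocab, classes, alpha=1e-3):
    """Stage 1 (assumed rule): move source word probabilities onto target words."""
    transferred = {}
    for c in classes:
        mass = {t: alpha for t in target_vocab}
        for s, logp in log_lik[c].items():
            translations = [t for t in lexicon.get(s, []) if t in mass]
            for t in translations:
                mass[t] += math.exp(logp) / len(translations)
        z = sum(mass.values())
        transferred[c] = {t: math.log(m / z) for t, m in mass.items()}
    return transferred

def refine(source_docs, source_labels, target_docs, lexicon, iterations=10):
    classes = sorted(set(source_labels))
    source_vocab = {w for d in source_docs for w in d}
    target_vocab = {w for d in target_docs for w in d}
    # Train on the labelled source documents (hard labels as 0/1 weights).
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for y in source_labels]
    log_prior, src_lik = m_step(source_docs, hard, classes, source_vocab)
    # Stage 1: transfer the model and produce initial (soft) target labels.
    log_lik = transfer_model(src_lik, lexicon, target_vocab, classes)
    post = e_step(target_docs, log_prior, log_lik, classes)
    # Stage 2: EM on the target documents only, starting from those labels.
    for _ in range(iterations):
        log_prior, log_lik = m_step(target_docs, post, classes, target_vocab)
        post = e_step(target_docs, log_prior, log_lik, classes)
    return [max(p, key=p.get) for p in post]

# Toy data (invented for illustration): the lexicon covers only four words.
source_docs = [["football", "match", "goal"], ["team", "goal"],
               ["bank", "stock"], ["stock", "market", "bank"]]
source_labels = ["sports", "sports", "finance", "finance"]
lexicon = {"football": ["足球"], "goal": ["进球"],
           "bank": ["银行"], "stock": ["股票"]}
target_docs = [["足球", "进球", "比赛"], ["比赛", "球队"],
               ["银行", "股票", "利率"], ["利率", "市场"]]

print(refine(source_docs, source_labels, target_docs, lexicon))
```

In this toy setup the second and fourth target documents contain no word covered by the lexicon, so the transferred model alone has nothing to go on for them. The EM stage can still assign them a class because they share words with target documents that the lexicon does cover, which is the intuition behind refining the initial labels in the target language itself.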