CROSSLINGUAL TEXT CLASSIFICATION METHOD USING EXPECTED FREQUENCIES
Abstract
A method that, given a bag-of-words representation of a text snippet written in a source language, calculates an expected bag-of-words representation in a target language. The method includes: a step in which, for a source word in the input bag-of-words, the probability that the source word is translated into a target word is calculated from given probabilities that the target word is translated into the source word and from co-occurrence probabilities of two or more target words, the co-occurrence probabilities being calculated from a corpus written in the target language; and a step in which, for each target word, the probabilities that the target word is a translation of a source word are summed over the source words in the input to give an expected count of the target word, and a feature vector is created from the expected counts. The resulting feature vector in the target language is taken as the expected bag-of-words representation of the input bag-of-words.
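
The two steps can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the scoring rule here is an assumption: each candidate target word t for a source word s is scored as p(s | t) multiplied by how strongly t co-occurs with the candidate translations of the other source words in the snippet, and the scores are then normalized per source word. The function names, the dictionary layouts, and the toy Spanish-to-English probabilities are all hypothetical.

```python
from collections import Counter, defaultdict


def pair_prob(cooc, t1, t2):
    """Look up a symmetric co-occurrence probability p(t1, t2); 0.0 if unseen."""
    return cooc.get((t1, t2), cooc.get((t2, t1), 0.0))


def expected_bag_of_words(source_bag, p_src_given_tgt, cooc):
    """Return expected target-word counts for a source-language bag-of-words.

    source_bag:      Counter mapping source word -> count
    p_src_given_tgt: dict target -> dict source -> p(source | target)  (given)
    cooc:            dict (target, target) -> co-occurrence probability,
                     assumed to be estimated from a target-language corpus
    """
    # Candidate translations of each source word: targets with p(s | t) > 0.
    candidates = {
        s: [t for t, d in p_src_given_tgt.items() if d.get(s, 0.0) > 0.0]
        for s in source_bag
    }
    expected = defaultdict(float)
    for s, count in source_bag.items():
        scores = {}
        for t in candidates[s]:
            # Assumed context term: support for t from the candidate
            # translations of the other source words in the snippet.
            support = sum(
                p_src_given_tgt[t2][s2] * pair_prob(cooc, t, t2)
                for s2 in source_bag if s2 != s
                for t2 in candidates[s2]
            )
            scores[t] = p_src_given_tgt[t][s] * support
        total = sum(scores.values())
        if total == 0.0:
            # No usable context: fall back to p(s | t) alone.
            scores = {t: p_src_given_tgt[t][s] for t in candidates[s]}
            total = sum(scores.values())
        if total == 0.0:
            continue  # source word has no known translation
        # Step 2: accumulate translation probabilities as expected counts.
        for t, score in scores.items():
            expected[t] += count * score / total
    return dict(expected)


# Toy data (hypothetical): "banco" may mean "bank" or "bench".
p_src_given_tgt = {
    "bank": {"banco": 0.9},
    "bench": {"banco": 0.8},
    "money": {"dinero": 1.0},
}
cooc = {("bank", "money"): 0.02, ("bench", "money"): 0.001}
result = expected_bag_of_words(Counter(["banco", "dinero"]), p_src_given_tgt, cooc)
```

In this toy case the presence of "dinero" shifts most of the expected count for "banco" onto "bank" rather than "bench", and the expected counts sum to the number of source tokens. The resulting dictionary can be turned into a feature vector by fixing an ordering of the target vocabulary.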