Tibetan-Chinese Cross Language Text Similarity Calculation Based on LDA Topic Model

Sun Yuan; Zhao Qian

Abstract

Topic model building is the basis and the most critical module of cross-language topic detection and tracking. Topic models can also be applied to cross-language text similarity calculation, where reducing the dimensionality of the text representation improves the efficiency and speed of calculation. In this paper, we use the LDA model in cross-language text similarity computation to obtain Tibetan-Chinese comparable corpora: (1) extending a Tibetan-Chinese dictionary by extracting Tibetan-Chinese entities from Wikipedia; (2) using the topic model to map the texts to the feature space of topics; (3) calculating the similarity of two texts in different languages according to the characteristics of news text. The LDA-based method for text similarity calculation reduces the dimensionality of the text vector space, enhances the understanding of text semantics, and improves the speed and efficiency of calculation.
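
Step (3) can be illustrated with a minimal sketch. It assumes that an LDA model has already mapped each text to a document-topic distribution, and it compares two such distributions with cosine similarity. The abstract does not state which similarity measure the authors use, so cosine similarity is an assumption here, and the topic vectors and variable names below are hypothetical values chosen only for illustration.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    # Treat an all-zero vector as having no similarity to anything.
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical document-topic distributions over K = 4 topics.
# In practice these would come from an LDA model trained on the corpus;
# they are hand-written here and are not results from the paper.
tibetan_doc = [0.70, 0.10, 0.10, 0.10]
chinese_doc_related = [0.65, 0.15, 0.10, 0.10]
chinese_doc_unrelated = [0.05, 0.05, 0.10, 0.80]

sim_related = cosine_similarity(tibetan_doc, chinese_doc_related)
sim_unrelated = cosine_similarity(tibetan_doc, chinese_doc_unrelated)

print(f"related pair:   {sim_related:.4f}")
print(f"unrelated pair: {sim_unrelated:.4f}")
```

Each text is represented by K topic weights rather than one weight per vocabulary word, which is the dimensionality reduction the abstract refers to: the comparison costs O(K) per document pair regardless of vocabulary size.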

Bibliographic record

Source: The Open Cybernetics & Systemics Journal, 2017, Issue 1, 9 pages
Authors: Sun Yuan; Zhao Qian
Original format: PDF
Chinese Library Classification: Cybernetics, information theory (mathematical theory)
