Journal: WSEAS Transactions on Computers

Capturing the semantic structure of documents using summaries in Supplemented Latent Semantic Analysis

Abstract

Latent Semantic Analysis (LSA) is a mathematical technique that captures the semantic structure of documents from correlations among the textual elements within them. Summaries of documents contain the words that actually contribute to the concepts of those documents. In the present work, summaries are used in LSA together with supplementary information, such as document category and domain information, in the model. This modification is referred to as Supplemented Latent Semantic Analysis (SLSA) in this paper. SLSA captures the semantic structure of documents from summaries of various proportions instead of from the full-length documents. The performance of SLSA on summaries is evaluated empirically in a document classification application by comparing its classification accuracy against that of plain LSA on full-length documents. The results show empirically that summaries can be used in place of full-length documents to capture the semantic structure of documents.
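
For orientation, plain LSA can be sketched as a truncated singular value decomposition of a term-document matrix. The sketch below is illustrative only and is not the paper's implementation: the toy summaries, the raw-count weighting, the choice of k = 2, and the helper names `term_document_matrix`, `lsa_document_vectors`, and `cosine` are all assumptions made for the example. It covers plain LSA only; the abstract does not say how the category and domain information enter the SLSA model, so that step is left out.

```python
import numpy as np


def term_document_matrix(docs):
    """Return (vocab, A): raw term counts, rows are terms, columns are documents."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return vocab, A


def lsa_document_vectors(A, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep the k largest singular values; each row of the result is one document.
    return (np.diag(s[:k]) @ Vt[:k]).T


def cosine(u, v):
    """Cosine similarity between two latent-space document vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Hypothetical toy "summaries": two on one topic, two on another.
summaries = [
    "matrix factorization reveals latent topics",
    "latent topics emerge from matrix factorization",
    "the football team won the match",
    "the team lost the football match",
]

vocab, A = term_document_matrix(summaries)
doc_vecs = lsa_document_vectors(A, k=2)

same_topic = cosine(doc_vecs[0], doc_vecs[1])
cross_topic = cosine(doc_vecs[0], doc_vecs[2])
print(f"same topic: {same_topic:.3f}, different topic: {cross_topic:.3f}")
```

In a classification setting such as the one the abstract describes, the latent document vectors would serve as features for a classifier. Which classifier the paper uses is not stated in the abstract.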
