WordICA-emergence of linguistic representations for words by independent component analysis


Abstract

We explore the use of independent component analysis (ICA) for the automatic extraction of linguistic roles or features of words. The extraction is based on the unsupervised analysis of text corpora. We contrast ICA with singular value decomposition (SVD), which is widely used in statistical text analysis in general and in latent semantic analysis (LSA) in particular. The representations found using SVD, however, cannot easily be interpreted by humans. In contrast, ICA applied to word context data gives distinct features that reflect linguistic categories. In this paper, we provide justification for our approach, called WordICA, present the method in detail, compare the obtained results with traditional linguistic categories and with the results achieved using an SVD-based method, and discuss the use of the method in practical natural language engineering solutions such as machine translation systems. Because the WordICA method is based on unsupervised learning and thus provides a general means for efficient knowledge acquisition, we foresee that the approach has clear potential for practical applications.
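
The pipeline the abstract outlines (word context data, an SVD step, then ICA) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy corpus, the one-word context window, the log transform of the counts, the choice of three components, and the use of a symmetric FastICA iteration with a tanh nonlinearity as the ICA algorithm. The helper names `context_matrix`, `whiten_with_svd`, `sym_decorrelate`, and `fastica` are hypothetical.

```python
import numpy as np

def context_matrix(sentences, window=1):
    """Log-scaled counts of context words (rows) seen near each word (columns)."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for pos, word in enumerate(s):
            for ctx in range(max(0, pos - window), min(len(s), pos + window + 1)):
                if ctx != pos:
                    counts[index[s[ctx]], index[word]] += 1
    return vocab, np.log1p(counts)

def whiten_with_svd(X, n_components):
    """SVD step: keep the top singular directions, scaled to unit second moment."""
    Xc = X - X.mean(axis=1, keepdims=True)  # center each context dimension
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Rows of Vt are orthonormal, so sqrt(n_words) * Vt satisfies Z Z^T / n = I.
    return np.sqrt(X.shape[1]) * Vt[:n_components]

def sym_decorrelate(W):
    """Return the orthogonal matrix (W W^T)^(-1/2) W, computed via the SVD of W."""
    U, _, Vt = np.linalg.svd(W)
    return U @ Vt

def fastica(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA with a tanh nonlinearity (assumed ICA variant)."""
    k, n = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        G_prime = 1.0 - G ** 2
        # Fixed-point update: E[g(y) z^T] - diag(E[g'(y)]) W, then re-orthogonalize.
        W_new = sym_decorrelate(G @ Z.T / n - np.diag(G_prime.mean(axis=1)) @ W)
        converged = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return W @ Z  # one row per component, one column per word

# Toy corpus, purely illustrative.
corpus = [
    "the cat sleeps", "the dog sleeps", "a cat runs",
    "a dog runs", "the bird sings", "a bird sings",
]
sentences = [line.split() for line in corpus]
vocab, X = context_matrix(sentences)
Z = whiten_with_svd(X, n_components=3)   # SVD-based representation
features = fastica(Z)                    # ICA rotation of that representation

for k, row in enumerate(features):
    top = [vocab[i] for i in np.argsort(-np.abs(row))[:3]]
    print(f"component {k}: strongest words {top}")
```

In this sketch `Z` plays the role of the SVD-based representation and `features` the ICA one. ICA only rotates the whitened SVD space, so the two carry the same information; the rotation is chosen to make the components as non-Gaussian as possible, and the abstract's claim is that on real corpora such components align with linguistic categories. A corpus this small is far too limited to show that effect reliably.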

Bibliographic record

  • Source
    Natural Language Engineering, 2010, Issue 3, pp. 277-308 (32 pages)
  • Author affiliations

    Adaptive Informatics Research Centre, Aalto University School of Science and Technology, P.O. Box 15400, FI-00076 Aalto, Finland;

    Department of Mathematics and Statistics, Department of Computer Science, University of Helsinki, P.O. Box 68, FI-00014 University of Helsinki, Finland; Helsinki Institute for Information Technology, University of Helsinki, P.O. Box 68, FI-00014 University of Helsinki, Finland;

    Adaptive Informatics Research Centre, Aalto University School of Science and Technology, P.O. Box 15400, FI-00076 Aalto, Finland

  • Original format: PDF
  • Text language: English