How to improve robustness in Kohonen maps and display additional information in Factorial Analysis: Application to text mining

Abstract

This article is an extended version of a paper presented at the WSOM'2012 conference (Bourgeois et al., 2012 [1]). We present a combination of factorial projections, the SOM algorithm and graph techniques applied to a text mining problem. The corpus contains eight medieval manuscripts which were used to teach arithmetic techniques to merchants. Among the techniques for Data Analysis, those used for Lexicometry (such as Factorial Analysis) highlight the discrepancies between manuscripts, because they focus on the deviation from independence between words and manuscripts. Still, we also want to discover and characterize the vocabulary common to the whole corpus. Using the properties of stochastic Kohonen maps, which define neighborhood between inputs in a non-deterministic way, we highlight the words which seem to play a special role in the vocabulary. We call them fickle and use them to improve both the robustness of the Kohonen map and the significance of the FCA visualization. Finally, we use graph algorithms to exploit this fickleness for the classification of words.
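The idea of "fickle" inputs can be illustrated with a small sketch. The abstract does not give the authors' exact definition, so the code below is only one plausible reading under stated assumptions: a tiny online SOM is trained several times with different random seeds, the fraction of runs in which two inputs land on the same or adjacent units is recorded, and an input is scored by the number of partners for which that fraction is neither close to 0 nor close to 1. The function names, the toy data, the map size and the thresholds `low` and `high` are all hypothetical.

```python
import random

def best_unit(weights, x):
    """Return the grid position of the unit whose weight vector is closest to x."""
    return min(weights, key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))

def train_som(data, rows, cols, epochs, rng):
    """Train a small rectangular SOM with online (stochastic) updates."""
    dim = len(data[0])
    units = [(r, c) for r in range(rows) for c in range(cols)]
    weights = {u: [rng.random() for _ in range(dim)] for u in units}
    steps = epochs * len(data)
    for t in range(steps):
        x = rng.choice(data)                  # random presentation order
        frac = t / steps
        lr = 0.5 * (1.0 - frac) + 0.01        # decaying learning rate (assumed schedule)
        radius = 1 if frac < 0.5 else 0       # shrinking neighborhood (assumed schedule)
        bmu = best_unit(weights, x)
        for u in units:
            if max(abs(u[0] - bmu[0]), abs(u[1] - bmu[1])) <= radius:
                w = weights[u]
                for k in range(dim):
                    w[k] += lr * (x[k] - w[k])
    return weights

def neighbor_frequency(data, runs=20, rows=3, cols=3, epochs=20, seed=0):
    """Fraction of runs in which each pair of inputs maps to the same or adjacent units."""
    n = len(data)
    counts = [[0] * n for _ in range(n)]
    for run in range(runs):
        rng = random.Random(seed + run)       # one independent seed per run
        weights = train_som(data, rows, cols, epochs, rng)
        bmus = [best_unit(weights, x) for x in data]
        for i in range(n):
            for j in range(i + 1, n):
                if max(abs(bmus[i][0] - bmus[j][0]), abs(bmus[i][1] - bmus[j][1])) <= 1:
                    counts[i][j] += 1
                    counts[j][i] += 1
    return [[c / runs for c in row] for row in counts]

def fickleness(freq, low=0.25, high=0.75):
    """Count, for each input, the partners whose neighbor frequency is ambiguous."""
    n = len(freq)
    return [sum(1 for j in range(n) if j != i and low < freq[i][j] < high)
            for i in range(n)]

# Toy 2-D inputs standing in for word profiles (made up for illustration).
toy = [[0.1, 0.1], [0.1, 0.1], [0.9, 0.9], [0.85, 0.9], [0.5, 0.5], [0.5, 0.1]]
freq = neighbor_frequency(toy)
scores = fickleness(freq)
print(scores)
```

Inputs with a high score change neighbors from run to run. In this sketch they would be the candidates to set aside or to mark on a factorial projection; how the article actually uses them is described only at the level of the abstract above.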

Bibliographic record

  • Source
    Neurocomputing | 2015, Issue 5 | pp. 120-135 | 16 pages
  • Author affiliations

    SAMM - Universite Paris 1 Pantheon-Sorbonne 90, rue de Tolbiac, 75013 Paris, France;

    SAMM - Universite Paris 1 Pantheon-Sorbonne 90, rue de Tolbiac, 75013 Paris, France;

    PIREH-LAMOP - Universite Paris 1 Pantheon-Sorbonne 1, rue Victor Cousin, Paris, France;

    PIREH-LAMOP - Universite Paris 1 Pantheon-Sorbonne 1, rue Victor Cousin, Paris, France;

    SAMM - Universite Paris 1 Pantheon-Sorbonne 90, rue de Tolbiac, 75013 Paris, France;

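  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords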

    Text mining; Kohonen maps; Factorial Analysis; Middle ages scientific literature; Graphs;

  • Date added to database: 2022-08-18 02:06:48
