Representation of textual documents by the approach wordnet and n-grams for the unsupervised classification (clustering) with 2D cellular automata: a comparative study

HAMOU Reda Mohamed; LEHIRECHE Ahmed; LOKBANI Ahmed Chaouki; RAHMANI Mohamed

Abstract

In this article we present a 2D cellular automaton (Class_AC) for a text mining problem: unsupervised classification (clustering). Before experimenting with the cellular automaton, we vectorized our data by indexing textual documents from the Reuters-21578 database using the WordNet approach and by representing the documents with the n-gram method. Our work is a comparative study of two representation approaches: the conceptual approach (WordNet) and n-grams. Section 1 gives an introduction to biomimicry and text mining, Section 2 presents the representation of texts based on the WordNet approach and on n-grams, Section 3 describes the cellular automaton for clustering, Section 4 shows the experiments and comparison results, and finally Section 5 gives a conclusion and perspectives.
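
As an illustration of the n-gram representation mentioned in the abstract, the following is a minimal sketch of turning documents into count vectors. It assumes character-level n-grams with n = 3 and simple lowercasing and whitespace normalization; the abstract does not state these choices, and the function names `char_ngrams` and `vectorize` are hypothetical, not taken from the paper.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a text.

    Assumption: lowercasing and collapsing whitespace are illustrative
    preprocessing choices, not the authors' documented pipeline.
    """
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def vectorize(docs, n=3):
    """Build a shared n-gram vocabulary and one count vector per document."""
    profiles = [char_ngrams(doc, n) for doc in docs]
    # The vocabulary is the sorted union of all n-grams seen in any document.
    vocab = sorted(set().union(*profiles))
    # A Counter returns 0 for n-grams a document does not contain.
    vectors = [[profile[gram] for gram in vocab] for profile in profiles]
    return vocab, vectors


if __name__ == "__main__":
    vocab, vectors = vectorize(["oil prices rise", "oil exports fall"])
    print(len(vocab), "distinct trigrams")
    print(vectors[0])
```

Vectors of this kind could then be compared with a distance or similarity measure by whatever clustering procedure is applied afterwards; the cellular automaton itself is not sketched here because the abstract does not describe its rules.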

Bibliographic details

Source: Computer and Information Science, 2010, Issue 3, 5 pages
Authors: HAMOU Reda Mohamed; LEHIRECHE Ahmed; LOKBANI Ahmed Chaouki; RAHMANI Mohamed
Original format: PDF
CLC classification: Computing technology, computer technology
