Conference: International Conference on Image and Signal Processing

Graph-Based Text Modeling: Considering Mathematical Semantic Linking to Improve the Indexation of Arabic Documents



Abstract

Indexing unstructured documents aims to build a list of words, or concepts, that simplifies their later exploration. The most widely used model for text modeling is the Vector Space Model. Despite its simple implementation and its wide use in text mining and information retrieval research, this model has an important limitation: it ignores the semantic relations between textual units by treating them as independent. A more suitable technique from data mining for highlighting the semantic linkage between text units is the graph-based representation. A graph can easily be adapted to textual data by representing words as vertices and the relations between them as edges. In this work, we introduce graph-based modeling of textual documents and study how the choice of semantic relation between text units affects the indexing of documents. We validate our findings through classification results.
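The idea of representing words as vertices and their relations as edges can be illustrated with a minimal sketch. The abstract does not specify which semantic relations the authors use, so this example assumes a simple sliding-window co-occurrence relation and ranks candidate index terms by weighted degree; the function names `build_cooccurrence_graph` and `rank_index_terms`, the `window` parameter, and the toy token list are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Build an undirected weighted graph as {word: {neighbor: weight}}.

    Assumption: two words are related when they appear within `window`
    consecutive positions of each other. This stands in for whatever
    semantic relation the paper actually uses.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        graph[word]  # make sure every word exists as a vertex
        for j in range(i + 1, min(i + window, len(tokens))):
            neighbor = tokens[j]
            if neighbor == word:
                continue  # skip self-loops
            graph[word][neighbor] += 1
            graph[neighbor][word] += 1
    return {w: dict(nbrs) for w, nbrs in graph.items()}


def rank_index_terms(graph):
    """Rank vertices by weighted degree (sum of incident edge weights)."""
    scores = {w: sum(nbrs.values()) for w, nbrs in graph.items()}
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))


# Toy, already-tokenized document (hypothetical data).
tokens = "graph model links words graph model indexes documents".split()
graph = build_cooccurrence_graph(tokens, window=2)
ranking = rank_index_terms(graph)
```

Unlike a Vector Space Model, which would only count each word independently, the edge weights here record which words occur together, so a term's importance can depend on its links to other terms.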
