Journal: Mathematiques et Sciences Humaines (Print)
Apprentissage d'un ensemble pré-structuré de concepts d'un domaine : l'outil Galex

English translation: Learning a pre-structured set of domain concepts: the Galex tool


Abstract

The volume of electronic text, including archives and working documents in academic organizations, administrations, and firms, is growing exponentially. One way to structure this mass of textual data is to build a knowledge model that indexes it. Such a model can be obtained through data extraction and classification, which yield a conceptual index by knowledge acquisition. Traditionally, the classification methods of Data Analysis have been adapted to, and used on, classical data tables in an object/characteristic/value format. We present Galex (Graph Analyzer for LEXicometry), which structures knowledge by means of a term clustering method. This structuring synthesizes the content of the information and provides the mapping data needed for information filtering or hypertext navigation across similar documents. Galex aims to take into account the nature of the data to which it is applied: natural language. The complexity of natural language is well known: sense ambiguity, multiple grammatical constructions of a sentence, style, term creation, and so on. By integrating the notions of concept, ontology, term, and corpus, which are poorly defined yet useful, we show that clustering can be improved by adding linguistic knowledge. We base our approach on typical phenomena such as graph-statistical relations between terms, scheme relations in a context, and canonical reduction of variants.
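To make the idea of graph-based term clustering concrete, here is a minimal Python sketch. It is not Galex's algorithm, which the abstract does not specify. It assumes that variants are reduced through a small hand-written lookup table (the hypothetical `CANONICAL` mapping), that the graph-statistical relation is a plain document-level co-occurrence count, and that clusters are the connected components of the graph once weak edges are dropped. The function names and the `min_count` threshold are illustrative assumptions as well.

```python
from collections import Counter
from itertools import combinations

# Hypothetical variant table (an assumption, not taken from Galex):
# maps surface forms to one canonical term.
CANONICAL = {"clusters": "cluster", "clustering": "cluster", "graphs": "graph"}


def canonical(term):
    """Reduce a term to its canonical form (lowercase, then table lookup)."""
    lowered = term.lower()
    return CANONICAL.get(lowered, lowered)


def cooccurrence(docs):
    """Count, for each pair of canonical terms, the documents containing both."""
    counts = Counter()
    for doc in docs:
        terms = sorted({canonical(t) for t in doc})
        counts.update(combinations(terms, 2))
    return counts


def cluster_terms(docs, min_count=2):
    """Group terms linked by at least `min_count` shared documents.

    Clusters are connected components of the thresholded co-occurrence
    graph; terms with no edge above the threshold are left out.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n in cooccurrence(docs).items():
        if n >= min_count:
            parent[find(a)] = find(b)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values())


docs = [
    ["graph", "clusters", "term"],
    ["graphs", "clustering"],
    ["ontology", "corpus"],
    ["ontology", "corpus", "term"],
]
clusters = cluster_terms(docs)
print(clusters)
```

In this toy corpus, "clusters" and "clustering" collapse to one canonical term before counting, so that term reaches the threshold with "graph" even though no single surface form does. A real system would replace the lookup table with proper linguistic processing and the raw count with a statistical association measure.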
