Conference: Hawaii International Conference on System Sciences, Annual

The TaxGen Framework: Automating the Generation of a Taxonomy for a Large Document Collection

Abstract

Text mining is an active area of research and development that combines and extends techniques from related areas such as information retrieval, computational linguistics, and data mining to analyze large corpora of digital documents. This paper describes the TaxGen text mining project carried out at the IBM Software Development Lab in Boeblingen, Germany. The goal of TaxGen was the automatic generation of a taxonomy for a collection of previously unstructured documents, namely a set of 73,000 news wire documents spanning one year.