Information Resources Management Association International Conference, vol. 1; May 23-26, 2004; New Orleans, LA (US)

Distributed Data Mining and its Applications to Intelligent Textual Information Processing


Abstract

Textual information processing is of fundamental importance because of the massive volume of documents, especially online text, that must be processed every day. In this paper, we study data mining techniques applied to intelligent textual information processing in distributed environments, including text classification, information extraction (IE), and topic detection and tracking (TDT). These intelligent processing techniques will improve the quality and efficiency of information resource management and utilization. Their statistical models and computational algorithms pose challenges for research in data mining and distributed/parallel computing. When successfully applied, they will benefit applications in IT, digital libraries, and information retrieval. Specifically, we study the distributed computation of the following algorithms: a naive Bayes classifier combined with expectation-maximization (EM) for text classification, a hidden Markov model for information extraction, and deterministic annealing with EM for topic detection and tracking. We also study the performance of the proposed algorithms and evaluate the improvements experimentally.
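As a rough illustration of why a naive Bayes text classifier lends itself to distributed computation, the sketch below counts classes and words independently on each data partition and then sums the counts. It is not the paper's algorithm: the EM step is omitted, and the function names (`local_counts`, `merge`, `classify`), the toy partitions, and the add-one smoothing are all assumptions made for this example.

```python
from collections import Counter
import math

def local_counts(docs):
    """Count class and per-class word frequencies on one partition."""
    class_counts = Counter()
    word_counts = {}  # label -> Counter of words
    for label, words in docs:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
    return class_counts, word_counts

def merge(stats):
    """Sum the per-partition counts (the only cross-node step)."""
    total_classes = Counter()
    total_words = {}
    for class_counts, word_counts in stats:
        total_classes.update(class_counts)
        for label, counts in word_counts.items():
            total_words.setdefault(label, Counter()).update(counts)
    return total_classes, total_words

def classify(words, class_counts, word_counts):
    """Pick the label with the highest log-posterior (add-one smoothing)."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        counts = word_counts[label]
        total = sum(counts.values())
        score = math.log(class_counts[label] / n_docs)
        for w in words:
            score += math.log((counts[w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: two partitions that could live on separate nodes.
partition_a = [("sports", ["ball", "team"]), ("tech", ["code", "data"])]
partition_b = [("sports", ["team", "win"]), ("tech", ["data", "mining"])]
model = merge([local_counts(partition_a), local_counts(partition_b)])
predicted = classify(["data", "code"], *model)
```

Because the model depends on the data only through additive counts, merging per-partition counts gives the same model as counting over the combined data, so each node needs to send only its counts and not its documents.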
