Published in: IEEE International Conference on Data Engineering

Domain-Independent Automated Processing of Free-Form Text Data in Telecom

Abstract

Free-form, unstructured, and semi-structured textual data has become increasingly prevalent in the telecommunications industry, among service and equipment providers alike. Typical examples include text from customer care tickets, machine logs, alarm and alerting systems, and diagnostics. There is a growing business need to rapidly and automatically understand the key topics and categories underlying these bulk collections of text. Because current practice relies on domain experts to analyze textual data, there is a clear need to apply text analytics to automate the process. Difficulties arise from the jargon-filled, fragmented, and incomplete nature of textual data in this field. In this paper, we propose a domain-agnostic, unsupervised approach that deploys a multi-stage text processing pipeline for automatically discovering the key topics and categories in free-form text documents. Using anonymized datasets drawn from actual customer care tickets and system logs, we show that our approach outperforms traditional text mining approaches and performs comparably to manual categorization by domain experts with full system knowledge.
