Domain-Independent Automated Processing of Free-Form Text Data in Telecom

Abstract

Free-form, unstructured, and semi-structured textual data has become increasingly prevalent in the telecommunications industry, among service and equipment providers alike. Typical examples include textual data from customer care tickets, machine logs, alarm and alerting systems, and diagnostics. There is a growing business need to rapidly and automatically understand the key topics and categories underlying these bulk collections of text. Because the present mode of operation relies on domain experts to analyze textual data, there is a clear need to apply text analytics to automate the process. Difficulties arise from the jargon-filled, fragmented, and incomplete nature of textual data in this field. In this paper, we propose a domain-agnostic, unsupervised approach that deploys a multi-stage text processing pipeline to automatically discover the key topics and categories in free-form text documents. Using anonymized datasets retrieved from actual customer care tickets and system logs, we show that our approach outperforms traditional text mining approaches and performs comparably to manual categorization undertaken by domain experts with full system knowledge.


Bibliographic details

Source
IEEE International Conference on Data Engineering, 2019, pp. 1841-1849 (9 pages)
Authors
Rajarshi Bhowmik; Ahmet Akyamac
Original format: PDF
Keywords
Text processing; Pipelines; Telecommunications; Data mining; Feature extraction; Frequency measurement; Business


Similar literature


1. A domain-independent process for automatic ontology population from text [J]. Carla Faria, Ivo Serra, Rosario Girardi. Science of Computer Programming. 2014, No. pta1
2. Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study [J]. Amy Y X Yu, Zhongyu A Liu, Chloe Pou-Prom. JMIR Medical Informatics. 2021, No. 5
3. Extracting statistical data from free-form text [J]. Hill L. Owen, Zein David A. Circuits and Devices Magazine, IEEE. 1986, No. 3
4. Domain-Independent Automated Processing of Free-Form Text Data in Telecom [C]. Rajarshi Bhowmik, Ahmet Akyamac. IEEE International Conference on Data Engineering. 2019
5. Automated generation of metadata for mining image and text data [D]. Al-Shameri, Faleh Jassem. 2006
6. Natural Language Processing and Automatic SNOMED-Encoding of Free Text: An Analysis of Free Text Data from a Routine Electronic Patient Record Application with a Parsing Tool Using the German SNOMED II [O]. Joerg H. Hohnloser, Matthias Holzer, Martin R.G. Fischer. 1996
7. Methodology of data collection and processing for the creation of associative and metaphoric dictionary of the Russian language designed for automated text processing systems (AMD-ATPS) [O]. Nikolay V. Golovko. 2018
8. Security Classification Using Automated Learning (SCALE): Optimizing Statistical Natural Language Processing Techniques to Assign Security Labels to Unstructured Text [R]. Brown, J. D., Charlebois, D. 2010
