Constructing Multiple Domain Taxonomy for Text Processing Tasks

Abstract

In recent years, large volumes of short text data have become easy to collect from platforms such as microblogs and product review sites. Such data often spans several domains, which makes multi-domain text processing difficult because the domains are hard to tell apart within the text. The concept of a multiple domain taxonomy (MDT) has shown promising performance in processing multi-domain text data. However, an MDT has to be constructed manually, which is time-consuming and requires substantial expert knowledge of the relevant domains. To address these issues, we introduce a semi-automatic method for constructing an MDT that requires only a small amount of manual input, combined with an unsupervised method that ranks multi-domain concepts based on semantic relationships learned from unlabeled data. We show that an MDT constructed iteratively with our semi-automatic method achieves higher accuracy in domain classification than existing methods, with an improvement of up to 11%.
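The abstract does not specify how concepts are ranked, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that "semantic relationships learned from unlabeled data" can be approximated by word co-occurrence within short texts, and that a small manual input takes the form of a few seed words per domain. The function names, the seed dictionary, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def cooccurrence_vectors(texts):
    """Map each word to a Counter of the words it shares a short text with."""
    vectors = defaultdict(Counter)
    for text in texts:
        words = set(text.lower().split())
        for a, b in combinations(sorted(words), 2):
            vectors[a][b] += 1
            vectors[b][a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def rank_candidates(texts, seeds):
    """For each domain, rank non-seed words by mean similarity to that domain's seeds.

    Assumption: co-occurrence cosine stands in for the learned semantic
    relationships mentioned in the abstract; the paper may use something else.
    """
    vectors = cooccurrence_vectors(texts)
    seed_words = {w for words in seeds.values() for w in words}
    ranking = {}
    for domain, words in seeds.items():
        scores = {}
        for cand in vectors:
            if cand in seed_words:
                continue
            sims = [cosine(vectors[cand], vectors[s]) for s in words if s in vectors]
            scores[cand] = sum(sims) / len(sims) if sims else 0.0
        # Highest score first; ties broken alphabetically for determinism.
        ranking[domain] = sorted(scores, key=lambda w: (-scores[w], w))
    return ranking


# Hypothetical unlabeled short texts mixing two domains.
texts = [
    "battery drains fast on this phone",
    "phone screen cracked battery fine",
    "pasta sauce was salty",
    "restaurant pasta and sauce great",
    "phone charger broke",
    "restaurant service slow",
]
# Hypothetical manual input: one seed concept per domain.
seeds = {"electronics": ["phone"], "food": ["restaurant"]}
ranking = rank_candidates(texts, seeds)
```

In a semi-automatic loop of the kind the abstract describes, a person would presumably review the top-ranked candidates for each domain, accept some as new concepts, and rerun the ranking with the enlarged seed set.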

Bibliographic details

Source: International Conference on Database and Expert Systems Applications; International Workshop on Big Data Management in Cloud Systems; International Workshop on Biological Knowledge Discovery; International Workshop on Technologies for Information Retrieval | 2018 | pp. 501-509 | 9 pages
Authors: Yihong Zhang; Yongrui Qin; Longkun Guo
Original format: PDF
