Reducing Feature Embedding Data for Discovering Relations in Big Text Data

Abstract

Relation extraction is a critical task in building a knowledge base from unstructured text documents. Most work on automatic relation extraction applies deep learning techniques such as Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) to large text corpora. However, these techniques require a large amount of human-labelled data, which is labour intensive to produce, and they are hard to apply to a new document domain without human supervision. This paper proposes a novel framework to extract relations in multi-domain texts effectively. In particular, we construct the framework in three phases: preprocessing, feature embedding, and relation extraction. We show that a small proportion of training data is sufficient to train our relation extraction framework and achieve good accuracy on relation extraction tasks.

Bibliographic details

Source
IEEE International Congress on Big Data | 2019 | 1 v. | 5 pages
Authors
Haojie Huang; Raymond Wong;
Original format: PDF
Chinese Library Classification: Programming; software engineering
Keywords
convolutional neural nets; learning (artificial intelligence); text analysis;

Similar literature

1. Entity Extraction for Malayalam Social Media Text Using Structured Skip-gram Based Embedding Features from Unlabeled Data [J]. G. Remmiya Devi, P.V. Veena, M. Anand Kumar. Procedia Computer Science. 2016, No. 1
2. Re-discover Values of Data Using Data Jackets by Combining Cluster with Text Analysis [J]. Yanyuan Zeng, Yukio Ohsawa. Procedia Computer Science. 2017, No. 1
3. An Unsupervised Data-driven Method to Discover Equivalent Relations in Large Linked Datasets [J]. Zhang Ziqi, Gentile Anna Lisa, Blomqvist Eva. Semantic Web. 2017, No. 3
4. Reducing Feature Embedding Data for Discovering Relations in Big Text Data [C]. Haojie Huang, Raymond Wong. IEEE International Congress on Big Data. 2019
5. Contrast and compact data mining: Discovering novel and useful patterns from large databases [D]. Wan, Qian. 2009
6. Text Categorization of Heart Lung and Blood Studies in the Database of Genotypes and Phenotypes (dbGaP) Utilizing n-grams and Metadata Features [O]. Mindy K. Ross, Ko-Wei Lin, Karen Truong. 2013
7. On the creation of derivative tantras of the Kālacakra teaching. Discovering common text block in the Sekoddeśa and the Śrīkālacakra and revealing traditional data on the relation of the Sekoddeśa to the Ādibuddha [O]. A. M. Strelkov. 2020
