SANE 2.0: System for fine grained named entity typing on textual data

Lal Anurag; Chowdary C. Ravindranath

Abstract

Assigning fine-grained types to named entities is gaining popularity as a major Information Extraction task because of its applications in several areas of Natural Language Processing. Existing systems use huge knowledge bases to improve the accuracy of the fine-grained types. We designed and developed SANE 2.0, an extended version of our earlier work SANE (Lal et al., 2017). It uses Wikipedia categories to refine the types of the named entities recognized in textual data. Entities whose types cannot be found through Wikipedia categories are typed using an intelligent information extraction method that draws on the search results of the Yahoo search engine. SANE uses an efficient algorithm to assign fine-grained types to the entities extracted from the data. Wikipedia groups related topics under common headings. From these categories, we constructed a database that contains Wikipedia articles and their corresponding categories. SANE uses this database to predict the category types of named entities. We use Stanford NER to identify named entities with their coarse-grained types. For locations, we use GeoNames data separately. We calculate the similarity between an entity and its categories using word2vec. Each entity is assigned to the category that has the highest similarity score with it. Finally, we map the category to the most appropriate WordNet (Miller et al., 1995) type. The main contribution of this work is a named entity typing system built without the use of knowledge bases. Through our experiments, 1) we establish the usefulness of Wikipedia categories for named entity typing, 2) we present an intelligent method of using Yahoo search results for named entity typing, and 3) we show that SANE's performance is on par with the state of the art.
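
The category-selection step described in the abstract (score each candidate Wikipedia category against the entity and keep the highest-scoring one) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vectors, the averaging of token vectors for multi-word phrases, and the helper names `cosine`, `phrase_vector`, and `best_category` are all assumptions made for illustration. A real system would load trained word2vec embeddings instead of the hand-written values below.

```python
import math

# Hypothetical 3-dimensional "embeddings" with made-up values.
# Trained word2vec vectors would replace this dictionary in practice.
TOY_VECTORS = {
    "einstein": [0.9, 0.1, 0.0],
    "physicists": [0.8, 0.2, 0.1],
    "births": [0.1, 0.9, 0.2],
    "violinists": [0.3, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def phrase_vector(phrase, vectors):
    """Average the vectors of the in-vocabulary tokens of a phrase.

    Returns None when no token is in the vocabulary. Averaging is an
    assumption of this sketch, not something stated in the abstract.
    """
    tokens = [t for t in phrase.lower().split() if t in vectors]
    if not tokens:
        return None
    dim = len(vectors[tokens[0]])
    return [sum(vectors[t][i] for t in tokens) / len(tokens) for i in range(dim)]


def best_category(entity, categories, vectors):
    """Return the category most similar to the entity, or None if unscorable."""
    entity_vec = phrase_vector(entity, vectors)
    if entity_vec is None:
        return None
    scored = []
    for category in categories:
        category_vec = phrase_vector(category, vectors)
        if category_vec is not None:
            scored.append((cosine(entity_vec, category_vec), category))
    if not scored:
        return None
    # Keep the category with the highest similarity score.
    return max(scored, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    candidates = ["German physicists", "1879 births", "Violinists"]
    print(best_category("Einstein", candidates, TOY_VECTORS))
```

In this sketch an entity with no in-vocabulary tokens yields `None`, which is the point where a fallback such as the search-result method mentioned in the abstract would take over. The final mapping from the chosen category to a WordNet type is not shown.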


Bibliographic details

Source
Engineering Applications of Artificial Intelligence | 2019, No. 9 | pp. 11-17 | 7 pages
Authors
Lal Anurag; Chowdary C. Ravindranath
Affiliations

Indian Inst Technol BHU Varanasi, Dept Comp Sci & Engn, Varanasi 221005, Uttar Pradesh, India

Original format: PDF
Language: English
Keywords
Named entity typing; Fine-grained; Wikipedia
