Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web

Abstract

In this paper, we present a method that automatically constructs a Named Entity (NE) tagged corpus from the web for training Named Entity Recognition systems. We use an NE list and a web search engine to collect web documents that contain the NE instances. The documents are refined through sentence separation and text refinement procedures, and the NE instances are finally tagged with the appropriate NE categories. Our experiments demonstrate that the suggested method can acquire, without any human intervention, a sufficiently large NE tagged corpus that is as useful as a manually tagged one.
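
The pipeline described in the abstract (NE list, document collection, sentence separation, refinement, tagging) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `search` function is a stand-in that filters an in-memory list of pages instead of calling a real search engine, the `<CATEGORY>...</CATEGORY>` tag format, the `min_tokens` filter, and the example NE list are all hypothetical, and the order of markup stripping and sentence splitting is a guess.

```python
import re

# Hypothetical NE list (surface form -> category); not taken from the paper.
NE_LIST = {"Samsung": "ORGANIZATION", "Seoul": "LOCATION"}

def search(name, pages):
    """Stand-in for a web search engine: return pages that mention the name."""
    return [page for page in pages if name in page]

def strip_markup(page):
    """Assumed refinement step: drop HTML-like tags and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", page)
    return re.sub(r"\s+", " ", text).strip()

def split_sentences(text):
    """Naive sentence separation: split after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]

def build_pattern(ne_list):
    """One regex over all names, longest first, so longer names win overlaps."""
    names = sorted(ne_list, key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")

def tag_sentence(sentence, ne_list, pattern):
    """Wrap every listed NE in an assumed <CATEGORY>...</CATEGORY> tag."""
    def wrap(match):
        name = match.group(0)
        return f"<{ne_list[name]}>{name}</{ne_list[name]}>"
    return pattern.sub(wrap, sentence)

def build_corpus(ne_list, pages, min_tokens=3):
    """Collect pages per NE, split and filter sentences, then tag them."""
    pattern = build_pattern(ne_list)
    corpus, seen = [], set()
    for name in ne_list:
        for page in search(name, pages):
            for sentence in split_sentences(strip_markup(page)):
                if sentence in seen:
                    continue  # the same page can be returned for several NEs
                seen.add(sentence)
                # Keep only sentences that are long enough and contain an NE.
                if len(sentence.split()) >= min_tokens and pattern.search(sentence):
                    corpus.append(tag_sentence(sentence, ne_list, pattern))
    return corpus
```

Calling `build_corpus(NE_LIST, pages)` on a small list of page strings returns the tagged sentences. A real system would replace `search` with actual search engine queries and would likely need stricter refinement and NE disambiguation than plain string matching provides.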

Bibliographic Information

Source: Proceedings of the Student Research Workshop, Interactive Posters/Demonstrations, and Tutorial Abstracts, 2003, pp. 165-168 (4 pages)
Authors: Joohui An; Seungwoo Lee; Gary Geunbae Lee
Original format: PDF
Chinese Library Classification: Computing technology, computer technology

Similar Literature

1. GENETAG: a tagged corpus for gene/protein named entity recognition [J]. Lorraine Tanabe, Natalie Xie, Lynne H Thom, BMC Bioinformatics. 2005, Supplement 1
2. An Automatically Generated Annotated Corpus for Albanian Named Entity Recognition [J]. Klesti Hoxha, Artur Baxhaku, Cybernetics and Information Technologies: CIT. 2017, No. 1
3. Quantum Criticism: a Tagged News Corpus Analysed for Sentiment and Named Entities [J]. Ashwini Badgujar, Sheng Cheng, Andrew Wang, Computer Science & Information Technology. 2020, No. 5
4. Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web [C]. Joohui An, Seungwoo Lee, Gary Geunbae Lee, Meeting of the Association for Computational Linguistics. 2003
5. Using a named entity tagger and a syntactic parser to improve Web-based answer extraction [D]. Kamel, Yasser. 2004
6. GENETAG: a tagged corpus for gene/protein named entity recognition [O]. Lorraine Tanabe, Natalie Xie, Lynne H Thom, 2005
7. Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web [O]. Joohui An, Seungwoo Lee, Gary Geunbae Lee, 2003
8. Patent Retrieval in Chemistry based on Semantically Tagged Named Entities [R]. Gurulingappa, H., Mueller, B., Klinger, R., 2009
