Research on Extracting Named Entities in Software Engineering Field from Wiki Webpage


Abstract

Extracting entity concepts from wiki pages is a common approach to entity recognition. Common methods for named entity recognition are based on Conditional Random Fields (CRF) and rules, such as Harvesting Domain Specific Knowledge Graph from Content of Webpages (HDSKG). However, HDSKG does not fully consider the features of entity concepts and term phrases in the software engineering field. To address this, we propose a more efficient algorithm. We first use webpage titles to construct a domain dictionary, then design regular rules according to the features of entity concepts in the software engineering field. Next, the domain dictionary is used to improve the NP chunks produced in the chunking process. Experimental results show that, compared with HDSKG, the proposed algorithm significantly improves the number of entities extracted, accuracy, precision, and recall.
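
The abstract gives only the outline of the pipeline (a title-derived dictionary, rules, dictionary-assisted chunking), so the following is a minimal sketch of that outline rather than the authors' implementation. It assumes that the "regular rules" can be approximated by regular-expression patterns, and it substitutes a greedy longest-match over the dictionary for a real POS-based NP chunker. The names `build_domain_dictionary`, `ENTITY_RULES`, and `extract_entities`, the specific patterns, and the sample titles are all hypothetical.

```python
import re


def build_domain_dictionary(titles):
    """Turn page titles into a set of lowercase token tuples (assumed format)."""
    dictionary = set()
    for title in titles:
        # Drop a trailing parenthetical such as "(programming language)".
        cleaned = re.sub(r"\s*\(.*?\)\s*$", "", title).strip().lower()
        if cleaned:
            dictionary.add(tuple(cleaned.split()))
    return dictionary


# Hypothetical surface patterns for software-engineering terms;
# the paper's actual rules are not given in the abstract.
ENTITY_RULES = [
    re.compile(r"^[A-Za-z]+(?:[A-Z][a-z0-9]+)+$"),      # camelCase / PascalCase
    re.compile(r"^[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+$"),   # dotted names
    re.compile(r"^[A-Za-z]+[0-9]+(?:\.[0-9]+)*$"),      # name plus version
]


def extract_entities(sentence, dictionary, max_len=4):
    """Greedy longest dictionary match, falling back to the pattern rules."""
    tokens = [t.strip(",;:()").rstrip(".") for t in sentence.split()]
    tokens = [t for t in tokens if t]
    entities, i = [], 0
    while i < len(tokens):
        match_len = 0
        # Prefer the longest dictionary phrase starting at position i.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(t.lower() for t in tokens[i:i + n]) in dictionary:
                match_len = n
                break
        if match_len:
            # Keep the multi-word term together as one chunk.
            entities.append(" ".join(tokens[i:i + match_len]))
            i += match_len
        else:
            if any(rule.match(tokens[i]) for rule in ENTITY_RULES):
                entities.append(tokens[i])
            i += 1
    return entities


titles = ["Unit testing", "Python (programming language)", "Continuous integration"]
domain_dict = build_domain_dictionary(titles)
print(extract_entities("Unit testing in Python uses JUnit.", domain_dict))
```

The dictionary lookup runs before the pattern rules so that multi-word terms found in page titles stay together as one chunk, which is the role the abstract assigns to the domain dictionary in the chunking step.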


Bibliographic details

Source
IEEE International Conference on Consumer Electronics - Taiwan | 2019 | 710p | 2 pages
Authors
JiaPei Guo; Hong Luo; Yan Sun
Original format: PDF
Chinese Library Classification: TM925-53
Keywords
natural language processing; software engineering; Web sites


Similar literature


1. Extracting Named Entities and Relating Them over Time Based on Wikipedia [J]. A. Bhole, B. Fortuna, M. Grobelnik. Informatica: An International Journal of Computing and Informatics. 2007, No. 4
2. Massive parallel sequencing uncovers actionable FGFR2–PPHLN1 fusion and ARAF mutations in intrahepatic cholangiocarcinoma [J]. Daniela Sia, Bojan Losic, Agrin Moeini. Nature Communications. 2015, No. 1
3. Dppa3 expression is critical for generation of fully reprogrammed iPS cells and maintenance of Dlk1-Dio3 imprinting [J]. Xingbo Xu, Lukasz Smorag, Toshinobu Nakamura. Nature Communications. 2015, No. 2016
4. Research on Extracting Named Entities in Software Engineering Field from Wiki Webpage [C]. JiaPei Guo, Hong Luo, Yan Sun. IEEE International Conference on Consumer Electronics - Taiwan. 2019
5. Measuring named entity similarity through Wikipedia category hierarchies [D]. Ashman, Jared M. 2010
6. Precursor-induced conditional random fields: connecting separate entities by induction for improved clinical named entity recognition [O]. Wangjin Lee, Jinwook Choi. 2019
7. Extracting named entities and synonyms from wikipedia [O]. Christian Bøhn. 2010
