Conference: International Conference on Intelligent Systems and Knowledge Engineering

A Construction Engineering Domain New Word Detection Method with the Combination of BiLSTM-CRF and Information Entropy

Abstract

New word detection is important for improving the performance of Chinese natural language processing tasks. To address inconsistent boundaries for coarse-grained long words and the difficulty of detecting compound words, this paper proposes a new word detection method that combines BiLSTM-CRF with information entropy (IE). First, the BiLSTM model extracts candidate new words. Then, information entropy is used to splice candidate new words and redefine word boundaries. The BiLSTM model makes effective use of context information, while the CRF accounts for the relationship between adjacent labels, yielding sentence-level sequence labeling; this helps identify compound words and long words that are otherwise difficult to detect. Experimental results show that the model achieves better performance on construction engineering datasets.
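The abstract does not give the entropy formula or the splicing rule, so the following is only a minimal sketch of one common way information entropy is applied to word boundaries: the Shannon entropy of the characters adjacent to a candidate. The function names `neighbor_entropy` and `should_splice`, the threshold value, and the toy corpus are all assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter

def neighbor_entropy(corpus, candidate, side="right"):
    """Shannon entropy (in nats) of the characters adjacent to `candidate`.

    Low entropy means the candidate is nearly always followed (or preceded)
    by the same character, which hints that the boundary is in the wrong place.
    """
    neighbors = Counter()
    for sentence in corpus:
        start = sentence.find(candidate)
        while start != -1:
            idx = start + len(candidate) if side == "right" else start - 1
            if 0 <= idx < len(sentence):
                neighbors[sentence[idx]] += 1
            start = sentence.find(candidate, start + 1)
    total = sum(neighbors.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in neighbors.values())

def should_splice(corpus, left, right, threshold=0.5):
    """Hypothetical splicing rule (an assumption, not taken from the paper):
    merge two adjacent candidates when the joined string occurs in the corpus
    and the boundary between them has low entropy on both sides."""
    if not any(left + right in sentence for sentence in corpus):
        return False
    return (neighbor_entropy(corpus, left, "right") < threshold
            and neighbor_entropy(corpus, right, "left") < threshold)

# Toy corpus of construction-engineering phrases (illustrative only).
corpus = ["预应力混凝土梁", "预应力混凝土结构", "预应力混凝土桥"]
merge_compound = should_splice(corpus, "预应力", "混凝土")
merge_suffix = should_splice(corpus, "混凝土", "梁")
```

In this toy corpus, "预应力" is always followed by "混", so the boundary entropy is zero and the two candidates are spliced into "预应力混凝土". The right neighbors of "混凝土" are three different characters (entropy ln 3 ≈ 1.10), so the boundary after it is kept.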
