BIG-DATA-BASED ZERO ANAPHORA RESOLUTION METHOD AND APPARATUS, AND DEVICE AND MEDIUM

Abstract

A big-data-based zero anaphora resolution method is disclosed. The method comprises: acquiring a sentence to be resolved and its preceding text, and vectorizing both to obtain a context vector representation of each word in the sentence to be resolved and of each word in the preceding text (S101); inputting these context vector representations into a bidirectional long short-term memory (BiLSTM) network to strengthen each word's contextual and positional information, thereby obtaining an enhanced context vector representation of each word (S102); traversing the enhanced context vector representations and, according to a parameter vector in a BERT model, predicting for each word the probability that it is the head word of the anaphora item and the probability that it is the tail word of the anaphora item (S103); traversing the words to construct continuous text segments and, from the head word and tail word probabilities of the words, calculating the anaphora item probability of each continuous text segment (S104); and selecting the continuous text segment with the maximum anaphora item probability as the anaphora item of the sentence to be resolved (S105). The method addresses two problems of existing zero anaphora resolution techniques: excessive dependence on a candidate set of anaphora items, and resolution results that are inaccurate and unstable.
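
Steps S104 and S105 can be illustrated with a minimal sketch. It assumes that the per-word head and tail probabilities from S103 are already available as plain lists, and that a segment's probability is the product of the head probability of its first word and the tail probability of its last word. The abstract does not state the combination rule, so that product, the `max_len` cap, the function name `best_span`, and the example words and numbers are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def best_span(
    head_probs: List[float],
    tail_probs: List[float],
    max_len: int = 10,
) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring contiguous span.

    Assumption: a span's score is head_probs[start] * tail_probs[end];
    the abstract does not specify the exact combination rule.
    """
    if len(head_probs) != len(tail_probs):
        raise ValueError("head_probs and tail_probs must have equal length")
    if not head_probs:
        raise ValueError("at least one word is required")

    # Start from the single-word span at position 0.
    best = (0, 0, head_probs[0] * tail_probs[0])
    for start, p_head in enumerate(head_probs):
        # Only consider spans of at most max_len words (hypothetical cap).
        last = min(start + max_len, len(tail_probs))
        for end in range(start, last):
            score = p_head * tail_probs[end]
            if score > best[2]:
                best = (start, end, score)
    return best


# Made-up probabilities for the words of a preceding sentence.
words = ["Xiao", "Ming", "bought", "an", "apple", "."]
head = [0.70, 0.05, 0.05, 0.10, 0.05, 0.05]
tail = [0.05, 0.80, 0.05, 0.02, 0.06, 0.02]

start, end, score = best_span(head, tail)
antecedent = words[start:end + 1]
print(antecedent, round(score, 2))
```

Because every contiguous segment is scored directly from word-level probabilities, no separately extracted candidate set is needed, which matches the dependence the abstract says the method avoids.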