Current word-vector-based multi-document summarisation methods do not take the order of words within a sentence into account. As a result, different sentences can be mapped to the same vector, and summaries generated from small-scale training data are highly redundant. To address these problems, we propose a multi-document summarisation method based on the PV-DM (Distributed Memory Model of Paragraph Vectors) model. The method first formulates a monotone submodular objective function. It then trains a PV-DM model to obtain sentence vectors, uses them to compute the semantic similarity between sentences, and from these similarities evaluates the monotone submodular objective function. Finally, it uses an optimisation algorithm to extract the sentences that form the summary. Experimental results on the standard Opinosis dataset show that the proposed method outperforms existing mainstream multi-document summarisation methods.
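The three steps described above can be illustrated with a minimal sketch. The abstract does not give the exact objective function or the optimisation algorithm, so the sketch makes its own assumptions: a facility-location coverage function (a standard monotone submodular function when similarities are non-negative) stands in for the objective, a plain greedy selection stands in for the optimisation algorithm, and small hand-written vectors stand in for sentence vectors that would in practice come from a trained PV-DM model (for example, gensim's `Doc2Vec` with `dm=1`). The names `cosine`, `coverage`, `greedy_summary`, and `toy_vectors` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coverage(selected, sims):
    """Facility-location objective (an assumption, not the paper's formula):
    each sentence is credited with its best similarity to any selected one."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sims)


def greedy_summary(vectors, budget):
    """Greedily pick up to `budget` sentence indices by marginal gain."""
    n = len(vectors)
    # Clip negative similarities to zero so the objective stays monotone.
    sims = [[max(0.0, cosine(vectors[i], vectors[j])) for j in range(n)]
            for i in range(n)]
    selected = []
    while len(selected) < min(budget, n):
        current = coverage(selected, sims)
        best, best_gain = None, 0.0
        for candidate in range(n):
            if candidate in selected:
                continue
            gain = coverage(selected + [candidate], sims) - current
            if gain > best_gain:
                best, best_gain = candidate, gain
        if best is None:  # no candidate improves the objective
            break
        selected.append(best)
    return selected


# Toy stand-ins for PV-DM sentence vectors: sentences 0 and 1 are
# near-duplicates, and so are sentences 2 and 3.
toy_vectors = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
]
picked = greedy_summary(toy_vectors, budget=2)
print(picked)
```

With a budget of two sentences, the greedy step takes one sentence from each near-duplicate pair, because a second sentence from an already covered pair adds almost no marginal gain. This is how a submodular objective discourages redundancy in the extracted summary.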