This paper proposes a new application for the large-scale question-answer (QA) pair resources available on the Internet: a large QA-pair library built from the Baidu Zhidao platform is added to a question answering system, and the question in the library most similar to the user's question is recommended to the user through similarity calculation. In the experiments, 10,500 web pages were downloaded and 4,687 QA pairs were successfully extracted. Three similarity methods were tested: TF/IDF of keywords, syntactic matching with a tree kernel function, and the semantic distance between questions. Using one, two, and all three of these methods yielded accuracies of 79.44%, 81.67%, and 88.33%, respectively. The results show that combining multiple methods to find similar questions gives better results.
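To make the retrieval step concrete, the following is a minimal sketch of the first of the three methods only (keyword TF/IDF), scored with cosine similarity. It is not the authors' implementation: the abstract does not give the weighting formula, so the smoothed IDF, the whitespace tokenizer, the function names, and the tiny `QA_PAIRS` library are all assumptions made for illustration. The tree kernel and semantic distance components are not shown.

```python
import math
from collections import Counter

# Hypothetical toy library of (question, answer) pairs; the real library
# in the paper holds 4,687 pairs extracted from Baidu Zhidao pages.
QA_PAIRS = [
    ("how to install python on windows", "Download the installer from python.org."),
    ("how to cook rice", "Rinse, add water, and simmer."),
    ("what is the capital of france", "Paris."),
]

def tokenize(text):
    # Assumption: whitespace tokenization. Chinese questions would need
    # a word segmenter before this step.
    return text.lower().split()

def build_idf(docs):
    # Smoothed inverse document frequency over the library questions.
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(tokens, idf):
    # Sparse TF-IDF vector; terms absent from the library are ignored.
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}

def cosine(a, b):
    # Cosine similarity of two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(query, qa_pairs):
    # Return the library pair whose question is most similar to the query,
    # together with its similarity score.
    docs = [tokenize(q) for q, _ in qa_pairs]
    idf = build_idf(docs)
    query_vec = tfidf_vector(tokenize(query), idf)
    scores = [cosine(query_vec, tfidf_vector(d, idf)) for d in docs]
    best = max(range(len(qa_pairs)), key=scores.__getitem__)
    return qa_pairs[best], scores[best]

if __name__ == "__main__":
    pair, score = recommend("install python windows", QA_PAIRS)
    print(pair[0], round(score, 3))
```

Under the combined setting the abstract reports, a score like this would be merged with the syntactic and semantic scores before ranking; how the three are weighted is not stated in the abstract, so that step is left out here.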