Implementation of The Common Phrase Index Method on The Phrase Query for Information Retrieval


Abstract

With the development of technology, finding information in news text has become easy, because news is distributed not only in print media such as newspapers but also in electronic media that can be accessed through a search engine. When searching for relevant documents with a search engine, a phrase is often used as the query. The number of words that make up the phrase query, and their positions, affect the relevance of the documents retrieved and, as a result, the accuracy of the information obtained. To address this problem, this research analyzes the implementation of the common phrase index method in information retrieval. The research was conducted on English news text and implemented in a prototype to determine the relevance level of the documents retrieved. The system is built in four stages: pre-processing, indexing, term weighting calculation, and cosine similarity calculation. The system then displays the search results ranked by cosine similarity. System testing was conducted using 100 documents and 20 queries, and the results were used in the evaluation stage. First, the relevant documents were determined using the kappa statistic. Second, the system's success rate was determined using precision, recall, and F-measure. The kappa statistic was 0.71, so the relevant documents are eligible for use in the system evaluation. The evaluation produced a precision of 0.37, a recall of 0.50, and an F-measure of 0.43. From this result it can be said that the success rate of the system in producing relevant documents is low.
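The ranking and evaluation stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the weighting scheme, so TF-IDF with a natural-log IDF is assumed here, whitespace tokenization stands in for the unspecified pre-processing, and the common phrase index itself is not modelled. All function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`, `f_measure`) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (assumed stand-in for pre-processing)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (score, doc_index) pairs sorted by descending cosine similarity."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the collection get weight 0.
    q_vec = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, reverse=True)


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With the figures reported in the abstract, `f_measure(0.37, 0.50)` evaluates to roughly 0.43, which agrees with the stated F-measure.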


Bibliographic details

Source
International Conference on Mathematics: Pure, Applied and Computation | 2017 | 1 v. (various pagings) | 9 pages
Authors
Triyah Fatmawati; Badrus Zaman; Indah Werdiningsih
Original format: PDF
CLC classification: O29-532
Keywords
Implementation; Common Phrase Index Method; Phrase Query

Similar literature


1. Efficient phrase querying with common phrase index [J]. Matthew Chang, Chung Keung Poon. Information Processing & Management. 2008, No. 2
2. A QA document retrieval method based on phrase-level analysis of the input natural language query [J]. Motoyuki Itoh, Masateru Kubodera. 電子情報通信学会技術研究報告. 思考と言語. Thought and Language. 2002, No. 92
3. A QA document retrieval method based on phrase-level analysis of the input natural language query [J]. Motoyuki Itoh, Masateru Kubodera. 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2002, No. 93
4. Implementation of The Common Phrase Index Method on The Phrase Query for Information Retrieval [C]. Triyah Fatmawati, Badrus Zaman, Indah Werdiningsih. International Conference on Mathematics: Pure, Applied and Computation. 2017
5. Understanding what verb phrases and adjective phrases have in common: Evidence from Mandarin alternations [D]. Lam, Charles Tsz-Kwan. 2015
6. Terminology spectrum analysis of natural-language chemical documents: term-like phrases retrieval routine [O]. Boris L. Alperin, Andrey O. Kuzmin, Ludmila Yu. Ilina. 2016
7. Efficient Phrase Querying with Common Phrase Index [O]. Matthew Chang, Chung Keung Poon. 2008
8. Phrase Dictionary Construction Methods for the R2 Information Retrieval System [R]. Jansen, J. M. 1969
