Assessing Relevance and Trust of the Deep Web Sources and Results Based on Inter-Source Agreement

RAJU BALAKRISHNAN; SUBBARAO KAMBHAMPATI; MANISHKUMAR JHA


Abstract

Deep web search engines face the formidable challenge of retrieving high-quality results from a vast collection of searchable databases. Deep web search is a two-step process: selecting the high-quality sources, then ranking the results from the selected sources. Although methods exist for both steps, they assess the relevance of sources and results using query-result similarity. When applied to the deep web, these methods have two deficiencies. First, they are agnostic to the correctness (trustworthiness) of the results. Second, query-based relevance does not consider the importance of the results and sources. Both considerations are essential for the deep web and for open collections in general. Since a number of deep web sources provide answers to any query, we conjecture that the agreement between these answers helps in assessing the importance and trustworthiness of the sources and the results. To assess source quality, we compute the agreement between sources as the agreement of the answers they return. While computing the agreement, we also measure and compensate for possible collusion between the sources. This adjusted agreement is modeled as a graph with sources at the vertices. On this agreement graph, the quality score of a source, which we call SourceRank, is calculated as the stationary visit probability of a random walk. To rank results, we analyze the second-order agreement between the results. Extending SourceRank further to multidomain search, we propose a source ranking that is sensitive to the query domains: multiple domain-specific rankings of a source are computed, and these ranks are combined into the final ranking. We perform extensive evaluations on online sources and on hundreds of Google Base sources spanning multiple domains. The proposed result and source rankings are implemented in the deep web search engine Factal. We demonstrate that the agreement analysis tracks source corruption. Further, our relevance evaluations show that our methods improve precision significantly over Google Base and the other baseline methods. The result ranking and the domain-specific source ranking are evaluated separately.
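The idea of scoring a source by the stationary visit probability of a random walk on the agreement graph can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not give the edge-weight formula, the collusion adjustment, or any smoothing scheme, so the edge weights below are made-up numbers, and the uniform teleport step (the `damping` parameter) is an assumption borrowed from PageRank-style walks. The function name `stationary_distribution` is hypothetical.

```python
def stationary_distribution(weights, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate the stationary visit probabilities of a random walk.

    weights[i][j] is a non-negative weight on the edge from source i to
    source j (assumed here to encode how strongly the two sources agree).
    The teleport step controlled by `damping` is an assumption of this
    sketch, not something stated in the abstract.
    """
    n = len(weights)

    # Row-normalize weights into transition probabilities. A source with
    # no outgoing weight jumps to every source uniformly.
    transition = []
    for row in weights:
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            transition.append([1.0 / n] * n)

    # Power iteration starting from the uniform distribution.
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new_rank = [
            (1.0 - damping) / n
            + damping * sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


# Illustrative (made-up) agreement weights for three sources: sources 0
# and 1 agree strongly with each other, source 2 agrees weakly with both.
agreement = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.5, 0.5, 0.0],
]
scores = stationary_distribution(agreement)
```

In this toy graph the two mutually agreeing sources end up with equal scores, both higher than the score of the weakly agreeing source, which matches the intuition that agreement from other sources acts as an endorsement.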


Bibliographic record

Source
ACM Transactions on the Web | 2013, Issue 2 | pp. 11.1-11.32 | 32 pages
Authors
RAJU BALAKRISHNAN; SUBBARAO KAMBHAMPATI; MANISHKUMAR JHA
Affiliations
Arizona State University; Arizona State University; Arizona State University
Original format: PDF
Language: English
Keywords
Deep web search; web trust; source rank; web database search; deep web integration; database integration; agreement analysis

