Integrating large-scale web data and curated corpus data in a search engine supporting German literacy education

Abstract

Reading material that is of interest and at the right level for learners is an essential component of effective language education. The web has long been identified as a valuable source of reading material due to the abundance and variability of materials it offers and its broad range of attractive and current topics. Yet, the web as source of reading material can be problematic in low literacy contexts. We present ongoing work on a hybrid approach to text retrieval that combines the strengths of web search with retrieval from a high-quality, curated corpus resource. Our system, KANSAS Suche 2.0, supports retrieval and reranking based on criteria relevant for language learning in three different search modes: unrestricted web search, filtered web search, and corpus search. We demonstrate their complementary strengths and weaknesses with regard to coverage, readability, and suitability of the retrieved material for adult literacy and basic education. We show that their combination results in a very versatile and suitable text retrieval approach for education in the language arts.

Bibliographic details

Source
Workshop on natural language processing for computer assisted language learning | 2019 | pp. 41-56 | 16 pages
Conference location: Turku (FI)
Authors
Sabrina Dittrich; Zarah Weiss; Hannes Schroeter; Detmar Meurers
Affiliations

Department of Linguistics University of Tuebingen;

German Institute for Adult Education - Leibniz Centre for Lifelong Learning;

Department of Linguistics University of Tuebingen LEAD Graduate School and Research Network University of Tuebingen;
