Testing Word Similarity: Language Independent Approach with Examples from Romance

Abstract

Identifying words with the same basic meaning (stemming) has important applications in Information Retrieval, above all in constructing word frequency lists. The usual morphology-based approaches (including the Porter stemmers) rely on language-dependent linguistic resources or knowledge, which causes problems when working with multilingual data and multi-thematic document collections. We propose several empirical formulae with easy-to-adjust parameters and demonstrate how to construct such formulae for a given language using an inductive method of model self-organization. This method considers a set of models (formulae) of a given class and selects the best ones using training and test samples. We describe the method and give detailed examples for French, Italian, Portuguese, and Spanish. The formulae are evaluated on real domain-oriented document collections. Our approach can be easily applied to other European languages.
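
The abstract does not state the formulae themselves, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical similarity test in which the share of non-matching final letters must not exceed a polynomial in the length of the common prefix, and it replaces the inductive method of model self-organization with a plain grid search: each model class (polynomial degree) is fitted on a training sample, and the class with the lowest error on a test sample is kept. All function names, the coefficient grid, and the hand-labeled Spanish word pairs are illustrative assumptions.

```python
from itertools import product


def common_prefix_len(a: str, b: str) -> int:
    """Length of the common initial part of two words."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def similar(a: str, b: str, coef: tuple) -> bool:
    """Hypothetical empirical test (an assumption, not taken from the abstract).

    y: length of the common prefix
    n: number of letters outside the common prefix, over both words
    s: total number of letters in both words
    The words are judged similar when n / s <= coef[0] + coef[1] * y + ...
    """
    y = common_prefix_len(a, b)
    if y == 0:
        return False
    n = (len(a) - y) + (len(b) - y)
    s = len(a) + len(b)
    threshold = sum(c * y ** k for k, c in enumerate(coef))
    return n / s <= threshold


def error_rate(pairs, coef) -> float:
    """Fraction of labeled pairs on which the formula gives the wrong answer."""
    wrong = sum(similar(a, b, coef) != label for a, b, label in pairs)
    return wrong / len(pairs)


def select_model(train, test, grid, max_degree=1):
    """Fit each polynomial degree on `train`; keep the one with lowest `test` error."""
    best = None
    for degree in range(max_degree + 1):
        # Exhaustive search over the coefficient grid (a simplification).
        fitted = min(product(grid, repeat=degree + 1),
                     key=lambda c: error_rate(train, c))
        score = error_rate(test, fitted)
        if best is None or score < best[0]:
            best = (score, fitted)
    return best


# Toy, hand-labeled pairs for illustration only: (word, word, same stem?).
TRAIN = [("cantar", "cantamos", True), ("casa", "casas", True),
         ("libro", "libros", True), ("pero", "perro", False),
         ("mar", "marca", False)]
TEST = [("hablar", "hablamos", True), ("gato", "gatos", True),
        ("para", "parar", False)]
GRID = (-0.3, 0.0, 0.1, 0.2, 0.3)

test_error, coefficients = select_model(TRAIN, TEST, GRID)
print("selected coefficients:", coefficients, "test error:", test_error)
```

Scoring candidates on a held-out test sample rather than on the training sample is what keeps a more flexible formula from being chosen only because it fits the training pairs more closely.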

Bibliographic record

Source
International Conference on Applications of Natural Language to Information Systems | 2004 | 13 pages
Authors
Mikhail Alexandrov; Xavier Blanco; Pavel Makagonov
Original format: PDF
Chinese Library Classification: TP18-532

Similar literature

1. A novel approach for modeling non-keyword intervals in a keyword spotter exploiting acoustic similarities of languages [J]. Heracleous P, Shimizu T. Speech Communication. 2005, No. 4
2. A language-independent approach to black-box testing using Erlang as test specification language [J]. Laura M. Castro, Miguel A. Francisco. The Journal of Systems and Software. 2013, No. 12
3. Wordform Similarity Increases With Semantic Similarity: An Analysis of 100 Languages [J]. Dautriche Isabelle, Mahowald Kyle, Gibson Edward. Cognitive Science. 2017, No. 8
4. Testing Word Similarity: Language Independent Approach with Examples from Romance [C]. Mikhail Alexandrov, Xavier Blanco, Pavel Makagonov. 2004
5. AN ANALYSIS OF WORDS SELECTED BY KINDERGARTEN AND FIRST-GRADE CHILDREN FROM LANGUAGE EXPERIENCE STORIES IN ORDER TO TEST THREE BASIC ASSUMPTIONS OF THE LANGUAGE-EXPERIENCE APPROACH TO TEACHING BEGINNING READING [D]. MUTHLEB, VERA EVELYN PATE. 1976
6. Understanding the spatial dimension of natural language by measuring the spatial semantic similarity of words through a scalable geospatial context window [O]. Bozhi Wang, Teng Fei, Yuhao Kang. 2020
7. The generation effect and word learning: a test of the effect of the language experience approach versus the text approach in the acquisition of new reading vocabulary [O]. Czaplicki Christine. 1990
