AMIA Annual Symposium Proceedings

Classification of Health Webpages as Expert and Non Expert with a Reduced Set of Cross-language Features

Abstract

Distinguishing expert from non-expert health documents can help users select the information best suited to them, depending on whether they are familiar with medical terminology. This issue is particularly important in information retrieval. We address it through stylistic corpus analysis and the application of machine learning algorithms. Our hypothesis is that the distinction can be made on the basis of a small number of features, and that these features can be language and domain independent. The features were acquired on a source corpus (Russian, diabetes) and then tested on both a target corpus (French, pneumology) and the source corpus. These cross-language features yield 90% precision and 93% recall for non-expert documents in the source language, and 85% precision and 74% recall for expert documents in the target language.
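The results above are reported per class as precision and recall. As a reminder of how these two measures are computed for one class of a two-class problem, here is a minimal sketch. The function name `precision_recall`, the label strings, and the toy label lists are all hypothetical and chosen for illustration; they are not data or code from the paper.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Return (precision, recall) for the class named `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: documents of the class that were predicted as the class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: documents of another class predicted as the class.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: documents of the class predicted as another class.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy labels, NOT results from the paper.
gold = ["non_expert", "non_expert", "expert", "expert", "non_expert"]
pred = ["non_expert", "expert", "expert", "non_expert", "non_expert"]

for label in ("non_expert", "expert"):
    p, r = precision_recall(gold, pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Reporting the two measures separately for each class, as the abstract does, shows whether a classifier favors one class over the other, which a single accuracy figure would hide.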