The importance of sharing patient-generated clinical speech and language data

Abstract

Increased access to large datasets has driven progress in NLP. However, most computational studies of clinically-validated, patient-generated speech and language involve very few datapoints, as such data are difficult (and expensive) to collect. In this position paper, we argue that we must find ways to promote data sharing across research groups, in order to build datasets of a more appropriate size for NLP and machine learning analysis. We review the benefits and challenges of sharing clinical language data, and suggest several concrete actions by both clinical and NLP researchers to encourage multi-site and multi-disciplinary data sharing. We also propose the creation of a collaborative data sharing platform, to allow NLP researchers to take a more active responsibility for data transcription, annotation, and curation.
