Optimizing Corpus Creation for Training Word Embedding in Low Resource Domains: A Case Study in Autism Spectrum Disorder (ASD)

Abstract

Automating the extraction of behavioral criteria indicative of Autism Spectrum Disorder (ASD) in electronic health records (EHRs) can contribute significantly to the effort to monitor the condition. Word embedding algorithms such as Word2Vec can encode the semantic meanings of words in vectors and assist in automated vocabulary discovery from EHRs. However, the text available for training word embeddings for ASD is minuscule compared to the billions of tokens typically used. We evaluate the importance of corpus specificity versus size and hypothesize that for specific domains small corpora can generate excellent word embeddings. We custom-built 6 ASD-themed corpora (N=4482), using ASD EHRs and abstracts from PubMed (N=39K) and PsychInfo (N=69K), and evaluated them. We were able to generate the most useful 200-dimension embeddings based on the small ASD EHR data. Because of the diversity in their vocabulary, the abstract-based embeddings generated fewer related terms and saw minimal improvement when the size of the corpus increased.
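The idea of discovering related vocabulary from word vectors can be illustrated with a small sketch. The code below does not use Word2Vec and is not the authors' pipeline: it swaps in plain co-occurrence count vectors with cosine similarity, using only the Python standard library, to show how terms that appear in similar contexts end up close together. The toy corpus, the window size, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: these sentences are invented for illustration
# and are not taken from the EHR or abstract data described above.
TOY_CORPUS = [
    "child avoids eye contact with peers",
    "child avoids eye contact with adults",
    "child shows repetitive hand flapping",
    "child shows repetitive body rocking",
    "limited eye contact during play",
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_terms(vectors, query, topn=3):
    """Return the `topn` words whose context vectors are closest to `query`."""
    query_vec = vectors.get(query, Counter())
    scores = [
        (word, cosine(query_vec, vec))
        for word, vec in vectors.items()
        if word != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


vectors = cooccurrence_vectors(TOY_CORPUS)
print(related_terms(vectors, "peers"))
```

In this sketch, words that share contexts (for example two words that both follow "contact with") receive similar vectors, which is the same intuition that makes a narrowly focused corpus useful for finding domain-specific related terms. A trained embedding model such as Word2Vec would replace the raw counts with dense, learned vectors.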


Bibliographic details

Journal: AMIA Annual Symposium Proceedings
Authors: Yang Gu; Gondy Leroy; Sydney Pettygrove; Maureen Kelly Galindo; Margaret Kurzius-Spencer
Year (volume): 2018 (2018)
Pages: 508–517
Total pages: 10
Original format: PDF

Similar literature

1. Examining bidirectional effects between the autism spectrum disorder (ASD) core symptom domains and anxiety in children with ASD [J]. Duvekot Jorieke, Ende Jan, Verhulst Frank C. Journal of child psychology and psychiatry. 2018, No. 3
2. Domain-general and domain-specific aspects of temporal discounting in children with ADHD and autism spectrum disorders (ASD): A proof of concept study [J]. Demurie E., Roeyers H., Baeyens D. Research in developmental disabilities. 2013, No. 6
3. Reading comprehension, word decoding and spelling in girls with autism spectrum disorders (ASD) or attention-deficit/hyperactivity disorder (AD/HD): performance and predictors [J]. Asberg J, Kopp S, Berg Kelly K. International journal of language & communication disorders. 2010, No. 1
4. Virtual Reality Enabled Training for Social Adaptation in Inclusive Education Settings for School-Aged Children with Autism Spectrum Disorder (ASD) [C]. Horace H.S. Ip, Simpson W.L. Wong, Dorothy F.Y. Chan. International conference on blended learning. 2016
5. The Use of Response Interruption and Redirection (RIRD) with Stimulus Control Training for Motor Stereotypy in Children with Autism Spectrum Disorder (ASD) [D]. Shahabuddin, Ambreen. 2018
6. FASTER and SCOTT&EVA trainings for adults with high-functioning autism spectrum disorder (ASD): study protocol for a randomized controlled trial [O]. Ludger Tebartz van Elst, Thomas Fangmeier, Ulrich Max Schaller. 2021
7. Domain-general and domain-specific aspects of temporal discounting in children with ADHD and autism spectrum disorders (ASD): a proof of concept study [O]. Demurie Ellen, Roeyers Herbert, Baeyens Dieter. 2013
8. Autism and Developmental Disabilities Monitoring Network, 2012. Prevalence of Autism Spectrum Disorders (ASDs) Among Multiple Areas of the United States in 2008 [R]. 2012
