
A Multi-Label Classification Approach for Coding Cancer Information Service Chat Transcripts


Abstract

The National Cancer Institute's (NCI) Cancer Information Service (CIS) offers an online, instant-messaging-based information service called LiveHelp to patients, family members, friends, and other cancer information consumers. A cancer information specialist (IS) ‘chats’ with a consumer and provides information on a variety of topics, including clinical trials. After a LiveHelp chat session ends, the IS codes about 20 different metadata elements describing the session in electronic contact record forms (ECRF), which are later used for quality control and reporting. Besides straightforward elements such as age and gender, more specific elements to be coded include the purpose of contact, the subjects of interaction, and the different responses provided to the consumer; the latter two often take on multiple values. ECRF coding is therefore a time-consuming task, and automating it could help ISs focus on their primary goal of providing consumers with valuable cancer-related information. As a first attempt at this task, we explored multi-label and multi-class text classification approaches to code the purpose, the subjects of interaction, and the responses provided, based on the chat transcripts. With a sample dataset of about 673 transcripts, we achieved example-based F-scores of 0.67 (for subjects) and 0.58 (for responses). We also achieved label-based micro F-scores of 0.65 (for subjects), 0.62 (for responses), and 0.61 (for purpose). To our knowledge, this is the first attempt at automatic coding of LiveHelp transcripts, and our initial results on this small corpus indicate promising future directions for the task.
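The abstract reports two kinds of F-score without defining them. As a rough illustration, the sketch below assumes the standard multi-label definitions: the example-based F-score averages a per-transcript F1 over all transcripts, while the label-based micro F-score pools true positives, false positives, and false negatives across all labels before computing a single F1. The function names, the label strings, and the toy data are hypothetical and are not taken from the paper; scoring an empty-versus-empty prediction as 1.0 is also an assumed convention.

```python
from typing import List, Set


def example_based_f1(true: List[Set[str]], pred: List[Set[str]]) -> float:
    """Mean over examples of 2*|Y ∩ Z| / (|Y| + |Z|)."""
    scores = []
    for y, z in zip(true, pred):
        if not y and not z:
            # Assumed convention: two empty label sets count as a perfect match.
            scores.append(1.0)
        else:
            scores.append(2 * len(y & z) / (len(y) + len(z)))
    return sum(scores) / len(scores)


def micro_f1(true: List[Set[str]], pred: List[Set[str]]) -> float:
    """F1 computed from counts pooled over every label and every example."""
    tp = sum(len(y & z) for y, z in zip(true, pred))  # labels correctly predicted
    fp = sum(len(z - y) for y, z in zip(true, pred))  # predicted but not true
    fn = sum(len(y - z) for y, z in zip(true, pred))  # true but not predicted
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Hypothetical toy data: three transcripts with made-up "subject" codes.
true_labels = [{"treatment", "clinical_trials"}, {"screening"}, {"treatment"}]
pred_labels = [{"treatment"}, {"screening", "prevention"}, {"clinical_trials"}]

print(round(example_based_f1(true_labels, pred_labels), 4))
print(micro_f1(true_labels, pred_labels))
```

On this toy data the two metrics differ (4/9 versus 0.5), which shows why they can diverge: the example-based score weights every transcript equally, whereas the micro score weights every label assignment equally.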
