Conference: American Society for Information Science and Technology

Automated Indexing of the Hazardous Substances Data Bank



Abstract

The Hazardous Substances Data Bank (HSDB), a factual data file produced and maintained by the Specialized Information Services (SIS) Division of the National Library of Medicine (NLM), contains over 4600 records on potentially hazardous chemicals. To improve information retrieval from HSDB, SIS has undertaken the development of an automated indexing protocol in collaboration with NLM's Indexing Initiative group. The Indexing Initiative investigates methods whereby automated indexing may partially or completely substitute for human indexing. Three main methodologies are applied: the MetaMap Indexing method, which maps text to concepts in the Unified Medical Language System (UMLS) Metathesaurus; the Trigram Phrase Matching method, which uses character trigrams to match text to Metathesaurus concepts; and a variant of the PubMed Related Citations method to find MeSH terms related to input text. The UMLS concepts generated by the first two methods are mapped to MeSH main headings through the Restrict-to-MeSH algorithm. The resulting MeSH terms are then clustered into a ranked list of recommended indexing terms. The purpose of the poster is to present our experience in applying these automated indexing methodologies to a large data file with highly structured records, a variety of text and data formats, and complex technical and biomedical terminology.
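As a rough illustration of the character-trigram idea mentioned in the abstract, the following minimal sketch scores candidate concept names by the fraction of their character trigrams that also occur in the input text. The abstract does not describe the actual Trigram Phrase Matching algorithm, so the scoring rule (`trigram_coverage`), the helper names, and the three-entry `TOY_CONCEPTS` list are all assumptions made for this example; they stand in for the Metathesaurus and are not NLM's implementation.

```python
def char_trigrams(text):
    """Return the set of character trigrams of a lowercased string.

    Whitespace is collapsed first so line breaks do not create
    spurious trigrams.
    """
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def trigram_coverage(text_grams, concept_name):
    """Fraction of the concept name's trigrams that appear in the text.

    This scoring rule is an assumption for illustration only.
    """
    concept_grams = char_trigrams(concept_name)
    if not concept_grams:
        return 0.0
    return len(concept_grams & text_grams) / len(concept_grams)


def rank_concepts(text, concepts, top_k=3):
    """Rank candidate concept names by trigram coverage, best first."""
    text_grams = char_trigrams(text)
    scored = [(name, trigram_coverage(text_grams, name)) for name in concepts]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical stand-in for a concept vocabulary (not real UMLS data).
TOY_CONCEPTS = ["Benzene", "Toluene", "Liver Neoplasms"]

if __name__ == "__main__":
    record_text = "Chronic exposure to benzene vapors causes bone marrow damage."
    for name, score in rank_concepts(record_text, TOY_CONCEPTS):
        print(f"{name}: {score:.2f}")
```

Because the score depends only on overlapping three-character windows, a match of this kind tolerates small spelling or inflection differences that exact string lookup would miss, at the cost of occasional partial matches between unrelated terms that share fragments.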
