Neural Naturalist: Generating Fine-Grained Image Comparisons

Abstract

We introduce the new Birds-to-Words dataset of 41k sentences describing fine-grained differences between photographs of birds. The language collected is highly detailed, while remaining understandable to the everyday observer (e.g., "heart-shaped face," "squat body"). Paragraph-length descriptions naturally adapt to varying levels of taxonomic and visual distance—drawn from a novel stratified sampling approach—with the appropriate level of detail. We propose a new model called Neural Naturalist that uses a joint image encoding and comparative module to generate comparative language, and evaluate the results with humans who must use the descriptions to distinguish real images. Our results indicate promising potential for neural models to explain differences in visual embedding space using natural language, as well as a concrete path for machine learning to aid citizen scientists in their effort to preserve biodiversity.
