
EyeSee: Camera to Caption with Attention Mechanism




According to the WHO, around 2.2 billion people worldwide are visually impaired or blind. Until recently, they had to rely on classic aids alone: the white cane and the guide dog for mobility, and magnifiers and screen readers, among others, for reading. The widespread use of smartphones has opened many new possibilities for visually impaired and blind people, who can now use their phones to help them navigate cities and other places. This project proposes a smartphone app that automatically tells a blind user which objects are around them. Automatically identifying and describing the content of an image is not a simple task, however. It combines problems from two complex fields, namely computer vision and natural language processing. The proposed application, EyeSee, captures images from a real-time environment, processes them frame by frame, and tells the user what each image represents. The app also annotates the images with text. The app uses deep learning, specifically the Show, Attend and Tell captioning model and a gated recurrent unit (GRU).
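The attention step at the heart of a Show, Attend and Tell style captioner can be illustrated with a minimal sketch. The code below is not taken from EyeSee; the function names, the toy feature vectors, and the dot-product scoring are assumptions made for illustration (the original Show, Attend and Tell model scores regions with a small learned network instead). It shows how a decoder state is turned into weights over image regions and then into a single context vector, which a recurrent decoder such as a GRU would consume when predicting the next caption word.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, hidden):
    """Return (weights, context) for one decoding step.

    features: list of region vectors, each a list of D floats
    hidden:   current decoder state, a list of D floats
    Scoring is a plain dot product (an illustrative assumption).
    """
    scores = [sum(f * h for f, h in zip(region, hidden)) for region in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Context vector: attention-weighted average of the region features.
    context = [
        sum(w * region[d] for w, region in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three image regions with 2-dimensional features (hypothetical values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = soft_attention(features, hidden)
print(weights, context)
```

In this toy input, the first and third regions align equally well with the decoder state and so receive equal weight, while the second region receives less. In a full model the weights would be recomputed at every word, letting the caption generator look at different parts of the frame as the sentence progresses.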




