A system and method for hand-gesture recognition are provided. The method includes receiving frames of a media stream of a scene captured from a first-person view (FPV) of a user using at least one RGB sensor communicably coupled to a wearable AR device. The media stream includes RGB image data associated with the frames of the scene. The scene comprises a dynamic hand gesture performed by the user. Temporal information associated with the dynamic hand gesture is estimated from the RGB image data by using a deep learning model. The estimated temporal information is associated with hand poses of the user and comprises a plurality of key-points identified on the user's hand in the plurality of frames. Based on the temporal information of the key-points, the dynamic hand gesture is classified into at least one predefined gesture class by using a multi-layered LSTM classification network.
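The classification stage described above can be illustrated with a minimal sketch of a stacked (multi-layered) LSTM that consumes one key-point vector per frame and outputs a gesture class. This is not the claimed implementation: the weights are random and untrained, and the number of key-points, the hidden size, the class names, and every identifier (`LSTMLayer`, `classify_gesture`, `NUM_KEYPOINTS`, `CLASSES`) are hypothetical choices made only for illustration. The key-point coordinates are likewise synthetic stand-ins for the output of the deep learning model that estimates hand poses from the RGB frames.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMLayer:
    """One LSTM layer with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, g = cell candidate, o = output.
        self.W = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                   for _ in range(hidden_size)]
            for gate in "ifgo"
        }
        self.b = {gate: [0.0] * hidden_size for gate in "ifgo"}

    def _gate(self, name, z, activation):
        return [
            activation(sum(w * v for w, v in zip(row, z)) + bias)
            for row, bias in zip(self.W[name], self.b[name])
        ]

    def forward(self, sequence):
        """Return the hidden state for every time step of the sequence."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        outputs = []
        for x in sequence:
            z = list(x) + h  # concatenate current input and previous hidden state
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            g = self._gate("g", z, math.tanh)
            o = self._gate("o", z, sigmoid)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            outputs.append(h)
        return outputs


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_gesture(keypoint_frames, layers, w_out, class_names):
    """Run the key-point sequence through the stacked layers and classify."""
    seq = keypoint_frames
    for layer in layers:
        seq = layer.forward(seq)  # each layer feeds its outputs to the next
    last_hidden = seq[-1]
    logits = [sum(w * v for w, v in zip(row, last_hidden)) for row in w_out]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs


# Hypothetical configuration, chosen only for this sketch.
NUM_KEYPOINTS = 3                 # assumption: key-points tracked per frame
FEATURES = NUM_KEYPOINTS * 2      # assumption: (x, y) image coordinates
HIDDEN = 8
CLASSES = ["swipe_left", "swipe_right", "tap"]  # hypothetical gesture classes

rng = random.Random(0)
layers = [LSTMLayer(FEATURES, HIDDEN, rng), LSTMLayer(HIDDEN, HIDDEN, rng)]
w_out = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)] for _ in CLASSES]

# Synthetic stand-in for key-points estimated over 10 frames.
frames = [[rng.random() for _ in range(FEATURES)] for _ in range(10)]
label, probs = classify_gesture(frames, layers, w_out, CLASSES)
print(label, probs)
```

Because the weights are untrained, the predicted label carries no meaning; the sketch only shows how per-frame key-points could flow through stacked recurrent layers to a probability over predefined gesture classes.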