Abstract:
Visual Odometry / Simultaneous Localization and Mapping (VO/SLAM) and egocentric hand gesture recognition are two key technologies for wearable computing devices such as Augmented Reality (AR) and Mixed Reality (MR) glasses. However, the AR/MR community lacks a suitable dataset for developing both hand gesture recognition and RGB-D SLAM methods. In this work, we use a ZED mini camera to build challenging benchmarks for RGB-D VO/SLAM tasks and dynamic hand gesture recognition. Our dataset, VOEDHgesture, consists of 264 sequences, along with precisely measured, time-synchronized ground-truth camera positions and manually annotated bounding boxes for the hand regions of interest. The sequences comprise both RGB and depth images, captured at HD resolution (1920 × 1080) and recorded at a video frame rate of 30 Hz. To resemble an Augmented Reality environment, the sequences are captured using a head-mounted ZED mini camera, with unrestricted