Grasping is an indispensable skill for general service robotics. In a home-like natural environment, the objects to be manipulated may not be known in advance, which rules out the traditional combination of grasp planning and visual pose estimation. Stereo vision is an inexpensive and relatively general way to sense 3-D objects. However, the quality of the data from a stereo camera can be prohibitively low for grasping. This paper proposes an approach to extract parametrized 3-D primitives that describe an object's overall shape, as well as its location, orientation, and size. This information is sufficient to grasp the object. Only a single stereo image pair is used to generate a partial 3-D point cloud, which is then approximated by simple primitives, such as a box or a cylinder. The approach combines initial estimation using RANSAC with subsequent iterative optimization of the unknown parameters. Experiments with real-world objects show that the approach can be used to grasp a range of objects using low-quality point clouds from single stereo pairs.
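The two-stage idea (a RANSAC estimate followed by iterative refinement) can be illustrated with a minimal sketch. The abstract does not give the primitive parametrization, the cost function, or the optimizer, so everything below is an assumption made for illustration: the primitive is reduced to a 2-D circle (the cross-section of an upright cylinder), the refinement is a lightly damped Gauss-Newton step on the point-to-circle distance, and all function names and thresholds are hypothetical rather than taken from the paper.

```python
import math
import random


def circle_from_3_points(p, q, s):
    """Circle (cx, cy, r) through three 2-D points, or None if nearly collinear."""
    (ax, ay), (bx, by), (cx, cy) = p, q, s
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, math.hypot(ax - ux, ay - uy)


def residual(model, pt):
    """Signed distance from a point to the circle boundary."""
    cx, cy, r = model
    return math.hypot(pt[0] - cx, pt[1] - cy) - r


def ransac_circle(points, iterations=200, tol=0.003, rng=None):
    """Stage 1: keep the 3-point hypothesis with the most inliers (tol is assumed)."""
    rng = rng or random.Random(0)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = circle_from_3_points(*rng.sample(points, 3))
        if model is None:
            continue
        inliers = [pt for pt in points if abs(residual(model, pt)) < tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            f = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= f * m[col][k]
    x = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        s = m[row][3] - sum(m[row][k] * x[k] for k in range(row + 1, 3))
        x[row] = s / m[row][row]
    return x


def refine_circle(model, points, steps=20):
    """Stage 2: Gauss-Newton on the sum of squared point-to-circle distances."""
    cx, cy, r = model
    for _ in range(steps):
        jtj = [[0.0] * 3 for _ in range(3)]
        jte = [0.0] * 3
        for x, y in points:
            d = math.hypot(x - cx, y - cy)
            if d == 0.0:
                continue  # the Jacobian is undefined exactly at the centre
            j = (-(x - cx) / d, -(y - cy) / d, -1.0)  # d(residual)/d(cx, cy, r)
            e = d - r
            for a in range(3):
                jte[a] += j[a] * e
                for b in range(3):
                    jtj[a][b] += j[a] * j[b]
        for a in range(3):
            jtj[a][a] += 1e-9  # tiny damping keeps the system solvable
        step = solve3(jtj, [-v for v in jte])
        cx, cy, r = cx + step[0], cy + step[1], r + step[2]
        if max(abs(v) for v in step) < 1e-10:
            break
    return cx, cy, r
```

In use, `ransac_circle` is called on the noisy points to obtain a rough model and its inlier set, and `refine_circle` is then run on those inliers only. A full 3-D cylinder or box fit would need more parameters (axis direction, height, box extents) and a matching minimal sampler, but the split into a robust initial guess and a local refinement stays the same.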