With the current demand for automation in the agro-food industry, accurately detecting and localizing relevant objects in 3D is essential for successful robotic operations. However, this is challenging due to the presence of occlusions. Multi-view perception approaches allow robots to overcome occlusions, but a tracking component is needed to associate the objects detected by the robot across multiple viewpoints. Multi-object tracking (MOT) algorithms can be categorized into two-stage and single-stage methods. Two-stage methods tend to be simpler to adapt and implement for custom applications, while single-stage methods are more complex end-to-end trackers that can yield better results under occlusion at the cost of requiring more training data. The potential advantages of single-stage methods over two-stage methods depend on the complexity of the sequence of viewpoints that a robot needs to process. In this work, we compare a 3D two-stage MOT algorithm, 3D-SORT, against a 3D single-stage MOT algorithm, MOT-DETR, on three types of sequences with varying levels of complexity. The sequences represent simpler and more complex motions that a robot arm can perform in a tomato greenhouse. Our experiments in a tomato greenhouse show that the single-stage algorithm consistently yields better tracking accuracy, especially in the more challenging sequences where objects are fully occluded or not visible across several viewpoints.