CeDiRNet-3DoF — Cloth Grasping in 3DoF
Winning method of the Cloth Manipulation and Perception Challenge at ICRA 2023
Second place in the Cloth Competition at ICRA 2024 with a 6-DoF extension
Object grasping is a core challenge in robotics and computer vision, and deformable objects like fabrics present additional complexity due to their non-rigid nature. We present CeDiRNet-3DoF, a deep learning model for grasp point detection that combines center-direction regression with regression of the grasp pose angle. Our method achieved 1st place in the perception task of the ICRA 2023 Cloth Manipulation Challenge and outperformed state-of-the-art approaches, including transformer-based models, on our benchmark.
A key contribution of our work is the ViCoS Towel Dataset, a large-scale benchmark for cloth grasping research, enabling fair and consistent evaluation across methods.

Key Contributions
- CeDiRNet-3DoF: Robust 3DoF grasp point detection for cloth objects based on the CeDiRNet architecture, extended to regress the orientation of the grasp pose as a single in-plane angle.
- ViCoS Towel Dataset:
- 8,000 real-world RGB-D images.
- 12,000 synthetic images (MuJoCo) for pretraining.
- Multiple towel configurations, lighting conditions, backgrounds, and clutter settings.
- Outperformed current state-of-the-art methods in real-world tests.
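The core idea behind center-direction regression can be illustrated with a toy decoder. The sketch below is a simplified, hypothetical reconstruction (not the repository's implementation): every pixel predicts a unit vector pointing toward the nearest grasp point, pixels cast votes by stepping along their predicted directions, vote maxima are taken as grasp-point centers, and the 3DoF extension then reads a regressed angle map at each detected center.

```python
import numpy as np

def decode_grasp_points(dir_cos, dir_sin, angle_map, n_steps=30, top_k=1):
    """Toy decoder for center-direction fields (illustrative sketch).

    dir_cos, dir_sin: per-pixel unit vectors pointing toward the nearest
    grasp point. angle_map: per-pixel regressed in-plane grasp angle.
    Each pixel casts votes along its predicted direction; vote maxima
    are returned as (x, y, angle) grasp candidates.
    """
    h, w = dir_cos.shape
    ys, xs = np.mgrid[0:h, 0:w]
    votes = np.zeros((h, w), dtype=np.float64)
    for t in range(1, n_steps + 1):
        # Step t pixels along each predicted direction and round to a cell.
        vx = np.round(xs + t * dir_cos).astype(int)
        vy = np.round(ys + t * dir_sin).astype(int)
        ok = (vx >= 0) & (vx < w) & (vy >= 0) & (vy < h)
        np.add.at(votes, (vy[ok], vx[ok]), 1.0)  # unbuffered accumulation
    # Take the top_k strongest vote maxima as grasp-point centers.
    flat = np.argsort(votes, axis=None)[::-1][:top_k]
    cy, cx = np.unravel_index(flat, votes.shape)
    return [(int(x), int(y), float(angle_map[y, x])) for x, y in zip(cx, cy)]
```

On a synthetic field where every pixel points at one center, the votes concentrate sharply at that center, which is what makes the representation robust to local prediction noise.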

Code and models
The implementation of CeDiRNet-3DoF is open source. The code and pretrained models are available on GitHub: CeDiRNet-3DoF Repository
Benchmarking
The ViCoS Towel Dataset provides a standardized benchmark for evaluating cloth grasping methods, addressing the lack of consistency in existing literature and enabling direct comparison of different approaches.
Find the benchmark on GitHub: CeDiRNet-3DoF Benchmark
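To make the notion of a standardized evaluation concrete, here is a minimal sketch of how grasp predictions are typically scored against ground truth. The matching rule and the distance/angle thresholds below are illustrative assumptions, not the official benchmark protocol:

```python
import numpy as np

def evaluate_grasps(pred, gt, dist_thresh=20.0, angle_thresh=np.deg2rad(15)):
    """Score predicted grasps (x, y, theta) against ground truth.

    A prediction counts as a true positive if it lies within dist_thresh
    pixels of a not-yet-matched ground-truth point and the wrapped angle
    error is below angle_thresh. Thresholds are illustrative only.
    """
    matched = [False] * len(gt)
    tp = 0
    for px, py, pa in pred:
        for i, (gx, gy, ga) in enumerate(gt):
            if matched[i]:
                continue
            dist = np.hypot(px - gx, py - gy)
            # Wrap the angle difference into [-pi, pi] before comparing.
            dang = abs((pa - ga + np.pi) % (2 * np.pi) - np.pi)
            if dist <= dist_thresh and dang <= angle_thresh:
                matched[i] = True
                tp += 1
                break
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gt) if gt else 0.0
    return precision, recall
```

Fixing such a matching rule and threshold set across all methods is exactly what a shared benchmark provides, so reported numbers become directly comparable.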
ViCoS Towel Dataset
We introduce the ViCoS Towel Dataset for training and benchmarking grasping models. It comprises 8,000 real-world RGB-D images of towels in various configurations, lighting conditions, and backgrounds, plus 12,000 synthetic images generated with the MuJoCo physics engine for pretraining.
Dataset details:
- 10 towels from the Household Cloth Objects set
- 10 configurations per towel
- 5 backgrounds
- 8 lighting variations
- With and without clutter
Download — Real-world samples (10 GB)
Download — Synthetic/MuJoCo samples (4 GB)

License and citation
The code and datasets are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Please cite our paper published in IEEE Robotics and Automation Letters when using this model and/or dataset:
@article{Tabernik2024RAL,
  author = {Tabernik, Domen and Muhovi{\v{c}}, Jon and Urbas, Matej and Sko{\v{c}}aj, Danijel},
  doi = {10.1109/lra.2024.3455802},
  issn = {2377-3766},
  journal = {IEEE Robotics and Automation Letters},
  number = {10},
  pages = {1--8},
  publisher = {IEEE},
  title = {{Center Direction Network for Grasping Point Localization on Cloths}},
  volume = {9},
  year = {2024}
}