CeDiRNet-3DoF — Cloth Grasping in 3DoF
Winning method of the Cloth Manipulation and Perception Challenge at ICRA 2023
Second place in the Cloth Competition at ICRA 2024 with a 6-DoF extension
Object grasping is a core challenge in robotics and computer vision, and deformable objects like fabrics present additional complexity due to their non-rigid nature. We present CeDiRNet-3DoF, a deep learning model for grasp point detection that combines center-direction regression with regression of the grasp pose angle. Our method achieved 1st place in the perception task of the ICRA 2023 Cloth Manipulation Challenge and outperformed state-of-the-art approaches, including transformer-based models, on our benchmark.
A key contribution of our work is the ViCoS Towel Dataset, a large-scale benchmark for cloth grasping research, enabling fair and consistent evaluation across methods.

Key Contributions
- CeDiRNet-3DoF: Robust 3DoF grasp point detection for cloth objects based on the CeDiRNet architecture, extended to regress the orientation of the grasp pose as a single in-plane angle.
- ViCoS Towel Dataset:
- 8,000 real-world RGB-D images.
- 12,000 synthetic images (MuJoCo) for pretraining.
- Multiple towel configurations, lighting conditions, backgrounds, and clutter settings.
- Outperformed current state-of-the-art methods in real-world tests.
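The core idea behind center-direction regression can be illustrated with a toy decoder. The sketch below is a simplified, hypothetical reconstruction (not the repository's implementation): every pixel predicts a unit vector pointing toward the nearest grasp point, pixels cast votes by stepping along their predicted directions, vote maxima are taken as grasp-point centers, and the 3DoF extension then reads a regressed angle map at each detected center.

```python
import numpy as np

def decode_grasp_points(dir_cos, dir_sin, angle_map, n_steps=30, top_k=1):
    """Toy decoder for center-direction fields (illustrative sketch).

    dir_cos, dir_sin: per-pixel unit vectors pointing toward the nearest
    grasp point. angle_map: per-pixel regressed in-plane grasp angle.
    Each pixel casts votes along its predicted direction; vote maxima
    are returned as (x, y, angle) grasp candidates.
    """
    h, w = dir_cos.shape
    ys, xs = np.mgrid[0:h, 0:w]
    votes = np.zeros((h, w), dtype=np.float64)
    for t in range(1, n_steps + 1):
        # Step t pixels along each predicted direction and round to a cell.
        vx = np.round(xs + t * dir_cos).astype(int)
        vy = np.round(ys + t * dir_sin).astype(int)
        ok = (vx >= 0) & (vx < w) & (vy >= 0) & (vy < h)
        np.add.at(votes, (vy[ok], vx[ok]), 1.0)  # unbuffered accumulation
    # Take the top_k strongest vote maxima as grasp-point centers.
    flat = np.argsort(votes, axis=None)[::-1][:top_k]
    cy, cx = np.unravel_index(flat, votes.shape)
    return [(int(x), int(y), float(angle_map[y, x])) for x, y in zip(cx, cy)]
```

On a synthetic field where every pixel points at one center, the votes concentrate sharply at that center, which is what makes the representation robust to local prediction noise.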

Code and models
The implementation of CeDiRNet-3DoF is open source. The code and pretrained models are available on GitHub: CeDiRNet-3DoF Repository
Benchmarking
The ViCoS Towel Dataset provides a standardized benchmark for evaluating cloth grasping methods, addressing the lack of consistency in existing literature and enabling direct comparison of different approaches.
Find the benchmark on GitHub: CeDiRNet-3DoF Benchmark
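To make the notion of a standardized evaluation concrete, here is a minimal sketch of how grasp predictions are typically scored against ground truth. The matching rule and the distance/angle thresholds below are illustrative assumptions, not the official benchmark protocol:

```python
import numpy as np

def evaluate_grasps(pred, gt, dist_thresh=20.0, angle_thresh=np.deg2rad(15)):
    """Score predicted grasps (x, y, theta) against ground truth.

    A prediction counts as a true positive if it lies within dist_thresh
    pixels of a not-yet-matched ground-truth point and the wrapped angle
    error is below angle_thresh. Thresholds are illustrative only.
    """
    matched = [False] * len(gt)
    tp = 0
    for px, py, pa in pred:
        for i, (gx, gy, ga) in enumerate(gt):
            if matched[i]:
                continue
            dist = np.hypot(px - gx, py - gy)
            # Wrap the angle difference into [-pi, pi] before comparing.
            dang = abs((pa - ga + np.pi) % (2 * np.pi) - np.pi)
            if dist <= dist_thresh and dang <= angle_thresh:
                matched[i] = True
                tp += 1
                break
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gt) if gt else 0.0
    return precision, recall
```

Fixing such a matching rule and threshold set across all methods is exactly what a shared benchmark provides, so reported numbers become directly comparable.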
ViCoS Towel Dataset
We introduce the ViCoS Towel Dataset for training and benchmarking grasping models. It comprises 8,000 real-world RGB-D images of towels in various configurations, lighting conditions, and backgrounds, plus 12,000 synthetic images generated with the MuJoCo physics engine for pretraining.
Dataset details:
- 10 towels from the Household Cloth Objects set
- 10 configurations per towel
- 5 backgrounds
- 8 lighting variations
- With and without clutter
Download — Real-world samples (10 GB)
Download — Synthetic/MuJoCo samples (4 GB)

License and citation
The code and datasets are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Please cite our paper published in IEEE Robotics and Automation Letters when using this model and/or dataset:
@article{Tabernik2024RAL,
  author = {Tabernik, Domen and Muhovi{\v{c}}, Jon and Urbas, Matej and Sko{\v{c}}aj, Danijel},
  doi = {10.1109/lra.2024.3455802},
  issn = {2377-3766},
  journal = {IEEE Robotics and Automation Letters},
  number = {10},
  pages = {1--8},
  publisher = {IEEE},
  title = {{Center Direction Network for Grasping Point Localization on Cloths}},
  volume = {9},
  year = {2024}
}