Collaborating partners
- University of Ljubljana, Faculty of Computer and Information Science
- UL, Faculty of Fine Arts
Funding
- ARRS (P2-0214)
Project aims
Computer vision is becoming a focal problem area of artificial intelligence. In the past, broader use of computer vision was hampered by image acquisition and by limited computer processing power, since images, and video in particular, require huge amounts of memory and processing. Problems were therefore often limited to structured, well-defined environments. The challenge of present computer vision research is to develop methods and systems that can operate robustly in unstructured environments, where the types of possible objects, situations and tasks cannot be defined in advance. We are experiencing a true explosion of visual information. Visual information in digital form can be recorded by any user of a smartphone and shared over the Internet. Social networks are becoming more and more visually oriented. On the one hand, there is a need for computer vision methods that run on mobile devices; on the other, desktop computers can increase their computing power using graphics processing units (GPUs). Fast Internet connections open up the possibility of using remote computing services.
Therefore, researchers are trying to solve harder and harder problems in the sense of fast and robust response to visual information coming from the environment. Machine learning, which was once a domain of symbolically oriented artificial intelligence, is becoming a requirement in computer vision, since only in this way can one develop robust systems that operate in unstructured environments. As the machine learning method of choice, deep, learnable models such as convolutional neural networks (CNNs) are gaining pre-eminence. In parallel with the development of a CNN-based solution for segmentation and reconstruction of superquadrics from 3D point clouds, we would like to develop, via transfer learning, a CNN that could solve the same task based only on 2D intensity and color images. Since no other method for this task is as fast as the proposed CNN-based solution, the result would be welcome in application domains where real-time image processing is a must, as well as where large clouds of 3D data points must be interpreted.
One of the main objectives of the research programme is to go beyond supervised deep learning and to reduce the need for a large number of training samples. We will pursue this goal by developing several concepts related to the fundamental understanding of deep learning. We will focus on a novel formulation of the deep network structure and on the development of basic methods for semi-supervised, unsupervised, and reinforcement learning. We will address adversarial learning of generative deep networks and develop novel methods that incorporate compositional properties in deep models. Although the computer vision and deep learning community has started investigating approaches that do not require huge amounts of labelled training data, the research field still predominantly relies on supervised learning. We therefore expect the results of the proposed research programme to have a significant impact on the future development of this research area.
Programme structure and main research lines
- WP1: Visual tracking methodology and evaluation
- WP2: Vision and learning for autonomous robots
- WP3: Beyond classic deep learning
- WP4: Segmentation and modelling of 3D point clouds
- WP5: Biometrics
- WP6: User interfaces applying cameras
VCoS is mainly involved in work in WPs 1, 2, and 3.