Collaborating partners
- University of Ljubljana, Faculty of Computer and Information Science
- DFG CONSULTING informacijski sistemi d.o.o.
Funding
- ARRS (L2-6765)
Researchers
Project overview
We live in an era of information abundance. However, the central concern is increasingly the quality and credibility of the acquired data rather than its quantity. This is especially true for visual information databases. Although the field of computer vision has made significant progress recently, methods for automatic image interpretation are still not reliable enough for fully autonomous annotation and maintenance of image and video databases (e.g. databases of detected objects). On the other hand, manually annotating video sequences with relevant objects is time-consuming, expensive, and tedious, and therefore prone to errors.
In this project we aimed to combine two approaches: computer-based automation of the image interpretation needed for database maintenance, and the suitable introduction of a human verifier into the loop. Such a combination is of central importance for developing a methodology for semi-automatic maintenance of traffic signalization records, which is part of our project’s practical goal. Even the database of such records for state roads alone in the Republic of Slovenia may contain more than 250,000 entries, obtained by processing image sequences along with additional information. Automation is therefore crucial for the continuous maintenance of such databases.
The main goal of the project was to develop a framework for semi-supervised incremental learning, as well as specific methods for visual learning and recognition, that would increase the quality and efficiency of maintaining large visual information databases. We developed efficient methods that address this problem. We based our research on hierarchical compositional models, on deep learning methods, and on methods that combine the two. We also considered different kinds of context (temporal, spatial, and semantic) to narrow down the search area in the images and thereby improve the recognition results. The main scientific contribution has therefore been made in the field of modelling visual information, more specifically in the development of methods for learning object representations for detection and recognition.
The developed methodology was applied to the use case of maintaining traffic signalization records, which is well suited to evaluating the developed algorithms. For this purpose we also built a comprehensive image database of annotated traffic signs. We therefore also expect our research to contribute substantially to improving the efficiency of traffic signalization monitoring, which would in the long run significantly reduce the cost of some elements of the traffic infrastructure.