ViCoS Eye is an experimental online service that demonstrates a state-of-the-art computer vision algorithm for object detection and categorization developed in our laboratories. Its purpose is to bring the current state of computer vision research closer to the public and to give us better insight into weaknesses of the algorithms that standard performance evaluation does not reveal.
The service is currently available as a simple web page at eye.vicos.si and as an Android application, which can be downloaded here.
What can it do?
The algorithms currently behind the web service are the LHOP model and the HoC descriptor, both developed by members of our team as part of research on hierarchical models. Both algorithms primarily enable visual object categorization and detection, so the service currently supports only those features. In the future we would also like to extend the service to intelligent content-based image search.
Because the algorithms require a training stage in which every visual category has to be learned, we have so far trained the service only on the following databases. The knowledge of this web service is therefore limited:
- Categorization with the Caltech 101 database, from which the following categories perform best: cup, chair, stop sign, bonsai, butterfly, camera, dollar bill, laptop, faces, scissors (upright orientation only), elephant, soccer ball, ceiling fan, chandelier, pistol.
- Detection with only two categories from the ETHZ Shape Classes dataset: mug and Apple logo.
Over time we plan to extend this initial knowledge with new databases and to incorporate user-submitted images to improve the overall performance of the service.
How it works
The architecture of the service is shown in this image.
The web service is implemented as a simple Tornado service that communicates with the backend through Beanstalkd queues. The backend runs as a Storm topology on a distributed set of machines. Currently three server racks, each with 27 workers (30 CPU units per machine), provide the processing power for the Storm topology. The implemented Storm topology is depicted in this image.
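To illustrate how a queue decouples the web front end from the processing backend, here is a minimal sketch in Python. It is not the service's code: the standard-library `queue.Queue` stands in for the Beanstalkd queues, a single thread stands in for the Storm topology, and `fake_categorize`, `backend_worker`, and `run_demo` are hypothetical names introduced only for this example.

```python
import queue
import threading


def fake_categorize(image_bytes):
    # Hypothetical placeholder for the real vision pipeline; it only
    # reports the payload size so the example stays self-contained.
    return {"label": "unknown", "size": len(image_bytes)}


def backend_worker(jobs, results):
    # Stand-in for the backend: take jobs off the request queue until
    # the None sentinel arrives, and push each answer to the reply queue.
    while True:
        job = jobs.get()
        if job is None:
            break
        job_id, image_bytes = job
        results.put((job_id, fake_categorize(image_bytes)))


def run_demo(images):
    # Stand-in for the web front end: enqueue every image with a job id,
    # then collect the replies and match them back to their ids.
    jobs, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=backend_worker, args=(jobs, results))
    worker.start()
    for job_id, image_bytes in enumerate(images):
        jobs.put((job_id, image_bytes))
    jobs.put(None)  # tell the worker there is no more work
    worker.join()
    collected = {}
    while not results.empty():
        job_id, result = results.get()
        collected[job_id] = result
    return collected
```

Because the front end only touches the queues, the number of workers behind them can change without any change to the code that accepts uploads.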
The algorithms running in the topology comprise the LHOP model, the HoC descriptor, and a LIBSVM Support Vector Machine with a chi-squared distance function. More detailed information about the algorithms can be obtained here and here.
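As an illustration of the chi-squared distance mentioned above, the following sketch computes one common form of it between two non-negative histograms and turns it into an exponential kernel. The exact variant and parameters used by the service are not stated here, so the formula, the `gamma` parameter, and the function names are assumptions made for this example.

```python
import math


def chi2_distance(x, y):
    # One common form: sum over bins of (x_i - y_i)^2 / (x_i + y_i).
    # Bins where both histograms are zero contribute nothing.
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(x, y):
        s = a + b
        if s > 0:
            total += (a - b) ** 2 / s
    return total


def chi2_kernel(x, y, gamma=1.0):
    # Exponential chi-squared kernel; gamma is an assumed scale parameter.
    return math.exp(-gamma * chi2_distance(x, y))


def gram_matrix(histograms, gamma=1.0):
    # Pairwise kernel values between all histograms.
    return [[chi2_kernel(a, b, gamma) for b in histograms] for a in histograms]
```

A matrix built this way could be handed to an SVM implementation that accepts precomputed kernels, which LIBSVM does; whether the service takes that route is not stated in this description.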
A short presentation of the associated paper from the 18th Computer Vision Winter Workshop (CVWW 2013) is available here.