Other on ViCoS Lab

Approximating Distributions Through Mixtures of Gaussians

Mon, 01 Jan 0001 00:00:00 +0000

Automatic fruit recognition using computer vision

Mon, 01 Jan 0001 00:00:00 +0000

The central topic of this diploma thesis was an analysis of the suitability of various computer vision algorithms for the problem of fruit recognition. Fruit is a challenging recognition domain because of the variety among fruits of the same class, the similarity of fruits from different classes, and the sheer number of different fruits. Successful fruit recognition required describing the images with a good attribute representation. Information about the color, texture, size and shape of the fruits was captured using established descriptors. Images were then classified from the attribute representation obtained with these descriptors, using established classification methods from the field of machine learning. For the classification methods to succeed, a large, high-quality collection of fruit images was needed. Because no such publicly available collection exists, it had to be captured. Based on the analysis of the results in the thesis, a recommender system for fruit recognition was built, which achieved a success rate of 85% on the challenging captured image collection.

Computer vision - CVWW '04 : proceedings of the 9th Computer Vision Winter Workshop

Mon, 01 Jan 0001 00:00:00 +0000

Deep-learning transformer-based sea level modeling ensemble for the Adriatic basin

Mon, 01 Jan 0001 00:00:00 +0000

Storm surges and coastal floods are persistent threats to civil and economic safety in the Northern Adriatic. The meteorologically induced sea level signal is, however, often difficult to forecast deterministically due to the resonant character of the Adriatic basin. A standard solution is therefore to resort to ensembles of numerical ocean models, which are computationally expensive. In recent years, deep-learning-based methods have shown significant potential as computationally cheap alternatives. This is the avenue followed in our work. We propose a new deep-learning transformer-based architecture, HIDRA-T, a continuation of our recent model HIDRA2 (Rus et al., GMD 2023), which outperformed both the state-of-the-art deep-learning network design HIDRA1 and two state-of-the-art numerical ocean models (a NEMO engine and a SCHISM ocean modeling system). HIDRA-T is our latest attempt at sea level forecasting, employing novel transformer-based atmospheric and sea level encoders. Transformers are designed for sequential data, and in HIDRA-T we use self-attention blocks to extract features from the atmospheric data, tokenizing first over the spatial dimension and then over the temporal dimension. HIDRA-T was trained on surface wind and pressure fields from the ECMWF atmospheric ensemble and on Koper tide gauge observations. On an independent and challenging test set, HIDRA-T outperforms all other models, reducing the previous best mean absolute forecast error in storm events, that of HIDRA2, by 2.6 %.

Detection of Point-Like Horizontal Traffic Signs, Technical Report, TR-LUVSS-17/02

Mon, 01 Jan 0001 00:00:00 +0000

Exploring levels of stereo fusion for obstacle detection in marine environment

Mon, 01 Jan 0001 00:00:00 +0000

Fully supervised and point-supervised ship detection using center prediction, LUVSS-2021-11

Mon, 01 Jan 0001 00:00:00 +0000

In maritime monitoring, detecting ships in aerial or satellite images is a common task. Although many fully supervised object detection methods can achieve excellent results in this domain, such methods remain limited by the amount of labeling required to create the training images. In this technical report, we explore novel methods for fully and weakly supervised learning of a ship detector from satellite images. We propose a novel dense prediction method for object detection that can be used in fully supervised learning mode to achieve state-of-the-art results, while a further modification allows learning from weakly labeled data such as point-supervision. Point-supervision, where only a single point/pixel on the object is known, can be used to fully automate the learning of a ship detection method by combining openly available satellite images with known ship positions from the global ship-tracking database AIS. This makes methods that can be trained from point-supervision highly suitable for the ship detection domain.

Guided Video Object Segmentation by Tracking

Mon, 01 Jan 0001 00:00:00 +0000

The paper presents the Guided video object segmentation by tracking (gVOST) method for human-in-the-loop video object segmentation, which significantly reduces the manual annotation effort. The method is designed for interactive object segmentation in a wide range of videos with minimal user input. The user iteratively selects and annotates a small set of anchor frames with just a few clicks on the object border. The segmentation is then propagated to the intermediate frames. Experiments show that gVOST performs well on the diverse and challenging videos used in visual object tracking (VOT2020 dataset), where it achieves an IoU of 73% with only 5% of the frames annotated by the user. This shortens the annotation time by 98% compared to the brute-force approach. gVOST outperforms the state-of-the-art interactive video object segmentation methods on the VOT2020 dataset and performs comparably on the less diverse DAVIS video object segmentation dataset.
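For readers unfamiliar with the metric, IoU (intersection over union) is the overlap between a predicted mask and a reference mask divided by the area of their union. The following is a minimal sketch on toy masks; the function name `iou` and the mask values are illustrative assumptions, not the evaluation code or data used for gVOST.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks agree perfectly; this also avoids dividing by zero.
    return 1.0 if union == 0 else intersection / union


# Hypothetical toy masks (1 = object pixel), not taken from any dataset.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
ground_truth = [
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

score = iou(predicted, ground_truth)  # 4 shared pixels out of 5 in the union
print(f"IoU = {score:.2f}")
```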

HIDRA2: deep-learning ensemble sea level and storm tide forecasting in the presence of seiches – the case of the northern Adriatic

Mon, 01 Jan 0001 00:00:00 +0000

We propose a new deep-learning architecture HIDRA2 for sea level and storm tide modeling, which is extremely fast to train and apply and outperforms both our previous network design HIDRA1 and two state-of-the-art numerical ocean models (a NEMO engine with sea level data assimilation and a SCHISM ocean modeling system), over all sea level bins and all forecast lead times. The architecture of HIDRA2 employs novel atmospheric, tidal and sea surface height (SSH) feature encoders as well as a novel feature fusion and SSH regression block. HIDRA2 was trained on surface wind and pressure fields from a single member of the European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric ensemble and on Koper tide gauge observations. An extensive ablation study was performed to estimate the individual importance of input encoders and data streams. Compared to HIDRA1, the overall mean absolute forecast error is reduced by 13 %, while in storm events it is lower by an even larger margin of 25 %. Consistent superior performance over HIDRA1 as well as over general circulation models is observed in both tails of the sea level distribution: low tail forecasting is relevant for marine traffic scheduling to ports of the northern Adriatic, while high tail accuracy helps coastal flood response. To assign model errors to specific frequency bands covering diurnal and semi-diurnal tides and the two lowest basin seiches, spectral decomposition of sea levels during several historic storms is performed. HIDRA2 accurately predicts amplitudes and temporal phases of the Adriatic basin seiches, which is an important forecasting benefit due to the high sensitivity of the Adriatic storm tide level to the temporal lag between peak tide and peak seiche.
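The error reductions quoted above are relative changes in mean absolute error (MAE). As a worked illustration of how such a figure is obtained, here is a minimal sketch; the sea level values are hypothetical numbers chosen for easy arithmetic and are not HIDRA results.

```python
def mean_absolute_error(forecast, observed):
    """Average of |forecast - observed| over paired samples."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(forecast)


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical sea levels in cm (illustrative only).
observed = [10.0, 20.0, 30.0, 40.0]
baseline_forecast = [15.0, 15.0, 35.0, 35.0]  # every sample off by 5 cm
new_forecast = [14.0, 16.0, 34.0, 36.0]       # every sample off by 4 cm

baseline_mae = mean_absolute_error(baseline_forecast, observed)
new_mae = mean_absolute_error(new_forecast, observed)
reduction = relative_reduction(baseline_mae, new_mae)
print(f"MAE {baseline_mae:.1f} -> {new_mae:.1f} cm, reduction {reduction:.1f} %")
```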

HIDRA3: A Robust Deep-Learning Model for Multi-Point Sea-Surface Height and Storm Surges Forecasting

Mon, 01 Jan 0001 00:00:00 +0000

Accurate forecasting of storm surges and extreme sea levels is crucial for mitigating coastal flooding and safeguarding communities. While recent advancements have seen machine learning models surpass state-of-the-art physics-based numerical models in sea surface height (SSH) prediction, challenges persist, particularly in areas with a limited SSH measurement history and in cases of sensor failure. In this study, we developed HIDRA3, a novel deep-learning approach designed to address these challenges by jointly predicting SSH at multiple locations, which allows training even in the presence of data scarcity and enables predictions at locations with sensor failures. Compared to the state-of-the-art model HIDRA2 and the numerical model NEMO, HIDRA3 demonstrates notable improvements, achieving, on average, 5.0% lower Mean Absolute Error (MAE) and 11.3% lower MAE on extreme sea surface heights.

HIDRA3: A Robust Deep-Learning Model for Multi-Point Sea-Surface Height Forecasting

Mon, 01 Jan 0001 00:00:00 +0000

Accurate sea surface height (SSH) forecasting is crucial for predicting coastal flooding and protecting communities. Recently, state-of-the-art physics-based numerical models have been outperformed by machine learning models, which rely on atmospheric forecasts and on the immediate past measurements obtained at the prediction location. The reliance on past measurements brings several drawbacks. While atmospheric training data is abundantly available, some locations have only a short history of SSH measurements, which limits the training quality. Furthermore, predictions cannot be made in cases of sensor failure, even at locations with abundant past training data. To address these issues, we introduce a new deep learning method, HIDRA3, that jointly predicts SSH at multiple locations. This allows improved training even in the presence of data scarcity at some locations and enables predictions at locations with failed sensors. HIDRA3 surpasses the state-of-the-art model HIDRA2 and the numerical model NEMO, on average obtaining a 5.0% lower Mean Absolute Error (MAE) and an 11.3% lower MAE on extreme sea surface heights.

Improvements of the Adriatic Deep-Learning Sea Level Modeling Network HIDRA

Mon, 01 Jan 0001 00:00:00 +0000

MVL Lab5: Multi-modal Indoor Person Localization Dataset

Mon, 01 Jan 0001 00:00:00 +0000

This technical report describes MVL Lab5, a multi-modal indoor person localization dataset. The dataset contains a sequence of video frames obtained from four calibrated and time-synchronized video cameras and a location event data stream from a commercially available radio-based localization system. The scenario involves five individuals walking around a realistically cluttered room. The provided calibration data and ground truth annotations enable evaluation of person detection, localization and identification approaches. These can be either purely computer-vision based or based on the fusion of video and radio information. This document is intended as the primary documentation source for the dataset, presenting its availability, acquisition procedure, and organization. The structure and format of the data are described in detail, along with documentation for the bundled Matlab code and examples of its use.

Extending the AUC Measure in the Analysis of Classifiers with ROC Curves

Mon, 01 Jan 0001 00:00:00 +0000

The AUC measure, which is used in classifier evaluation and is one of the main tools of ROC analysis, has certain shortcomings. It does not take into account the values of the scores assigned to the examples, only their ranking. As a consequence, it is unreliable when assessing sets of examples in which the differences between the scores are negligible. Another weakness of AUC is that it is uninformative when comparing sets that contain the same number of errors.

For these reasons, researchers have proposed improvements to the AUC measure that also take the score values into account. In this work we examine four such measures. We found, however, that these variants do not eliminate all of the weaknesses either, and may even introduce new ones. In particular, properties of the sets of examples under consideration can have an undue influence on the behavior of these variants.
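The ranking-only behavior of AUC can be seen in a small example. The sketch below computes AUC with the standard pairwise definition (the fraction of positive/negative pairs ranked correctly, ties counted as half) on two hypothetical score sets that share the same ordering but differ greatly in score margins. The data and function name are illustrative assumptions and do not come from the thesis.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical examples: same labels and same ordering in both score sets.
labels = [1, 1, 0, 1, 0, 0]
confident = [0.95, 0.90, 0.60, 0.40, 0.10, 0.05]            # well-separated scores
marginal = [0.5005, 0.5004, 0.5003, 0.5002, 0.5001, 0.5000]  # negligible differences

auc_confident = auc(confident, labels)
auc_marginal = auc(marginal, labels)
print(auc_confident, auc_marginal)  # identical, because only the ranking matters
```

Both score sets rank 8 of the 9 positive/negative pairs correctly, so AUC cannot distinguish the well-separated scores from the nearly indistinguishable ones.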

Object Tracking in Robot Soccer

Mon, 01 Jan 0001 00:00:00 +0000

Understanding Convolutional Neural Networks for Object Recognition

Mon, 01 Jan 0001 00:00:00 +0000

Since deep learning originates from the field of computer vision, this talk will focus more closely on deep learning approaches to computer vision problems. We will focus on convolutional neural networks (CNN or ConvNet): how they work, what makes them particularly useful for computer vision problems, which important “tricks” make them work so well (ReLU, dropout, batch norm …), and what the visualization of features can tell us about CNNs. The talk will start with the basics of deep learning (gradient descent and back-propagation), so no prior knowledge is needed, but some knowledge of mathematics (statistics and derivatives) could be useful for properly understanding the more advanced “tricks”. At the end of the talk we will also look at the method being developed at ViCoS lab at UL FRI, which tries to advance CNNs by combining them with compositional hierarchies and to improve the understanding of features.

Visual Detection of Business Cards: Key-Point Correspondences Filtering

Mon, 01 Jan 0001 00:00:00 +0000

This study explores coarse localization of a planar object using interest key-points and the RANSAC algorithm. The method is employed as part of an application for the detection and recognition of a business card being waved in front of a camera. Localization follows the method of Vincent and Laganiere, where the RANSAC algorithm is used to find a homography between two images that contain a dominant planar region. The RANSAC algorithm and a method for finding planar objects in two consecutive frames are presented in detail, with key-point stability over multiple frames additionally being employed to remove background key-points. We evaluate the method on four business cards, two non-textured and two textured, and show that it significantly reduces background key-points when a dominant planar object is present. We also show that background key-points are removed completely when correspondences are matched on every fifth frame and each key-point is required to be visible for at least 15 frames.
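The multi-frame stability criterion can be sketched as a simple filter over key-point tracks. This is not the authors' implementation: the track representation (a hypothetical mapping from key-point id to the frame indices at which it was matched), the function name, and the reading of "visible for at least 15 frames" as an unbroken run of matches spanning 15 frames are all assumptions made for illustration.

```python
def stable_keypoints(matched_frames, step=5, min_visible=15):
    """Return ids of key-points whose longest unbroken run of matches
    spans at least `min_visible` frames.

    matched_frames: dict mapping a key-point id to the frame indices at
    which it was matched; matching is assumed to happen every `step` frames.
    """
    stable = []
    for kp_id, frames in matched_frames.items():
        longest_span = 0
        run_start = None
        previous = None
        for frame in sorted(frames):
            # A missed match (gap larger than `step`) starts a new run.
            if previous is None or frame - previous != step:
                run_start = frame
            previous = frame
            longest_span = max(longest_span, frame - run_start)
        if longest_span >= min_visible:
            stable.append(kp_id)
    return sorted(stable)


# Hypothetical tracks, matched on every fifth frame.
tracks = {
    "card_corner": [0, 5, 10, 15, 20],     # unbroken, spans 20 frames
    "card_logo": [5, 10, 15, 20],          # unbroken, spans 15 frames
    "background_a": [0, 5],                # spans only 5 frames
    "background_b": [0, 5, 10, 20, 25],    # broken at frame 15
}

kept = stable_keypoints(tracks)
print(kept)
```

With these toy tracks only the two key-points that stay matched across a 15-frame span survive; the short-lived and interrupted tracks are discarded as background.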

Visual Detection of Business Cards: Segmentation

Mon, 01 Jan 0001 00:00:00 +0000

This study explores the Graph Cut method for the segmentation of business cards captured in a sequence of images. Segmentation is needed as part of an application for the detection and recognition of business cards. We explore the Graph Cut method in detail and show that it can be applied to business card segmentation. We show how, in the Graph Cut method, foreground and background regions can be successfully initialized using the business card key-point detector from our previous study. Furthermore, we show how a sequence of images can be used to improve the foreground/background initialization and how the final segmentation can be improved by merging multiple frames. We demonstrate our proposed approach on a set of four business card sequences, using two textured and two non-textured business cards.

Visual Detection of Business Cards: Study of Interest Key-point Detectors

Mon, 01 Jan 0001 00:00:00 +0000

This study examines the use of interest key-point detectors for detecting and recognizing business cards that are slowly waved in front of the camera. We focus on two interest point detectors: Scale Invariant Feature Transform (SIFT) and Maximally Stable Extremal Regions (MSER). Both are presented in detail, with the main emphasis on the SIFT detector. The characteristics of both are examined with respect to the detection of business cards, and SIFT is selected as a suitable detector for this problem. The stability of SIFT key-points is also experimentally evaluated on a database of business cards newly created for this purpose.