Precision Viticulture (PV) is becoming an active and interdisciplinary research field, since it requires solving interesting research problems to meet the concrete demands of specific use cases. A challenging problem in this context is the development of automatic methods for yield estimation. Computer vision methods can contribute to this task, especially those that replicate what winemakers do manually. In this paper, an automatic artificial intelligence method for grape bunch detection from RGB images is presented. A customized Convolutional Neural Network (CNN) is employed for pointwise classification of image pixels, and the dependence of the classification results on the type of input color channels and on the color properties of the grapes is studied. The advantage of using additional perception-based input features, such as luminance and visual contrast, is also evaluated, as is the dependence of the method on the choice of the training set in terms of the amount of labeled data. The latter point has a significant impact on the practical use of the method on-site, its usability by non-expert users, and its adaptability to individual vineyards. Experimental results show that a properly trained CNN can discriminate and detect grape bunches even under uncontrolled acquisition conditions and with limited computational load, making the proposed method implementable on smart devices and suitable for on-site and real-time applications.
Smart farming is becoming an active and interdisciplinary research field, as it requires solving interesting and challenging research problems to meet the concrete demands of specific use cases. One of the most delicate tasks is automatic yield estimation, for example in vineyards [1]. Computer vision methods that implement the rules of the human visual system can contribute to this task, as they simulate what winemakers do manually [2]. An automatic artificial intelligence method for grape bunch detection from RGB images is presented. It properly defines the input of a Convolutional Neural Network whose task is the segmentation of grape bunches [3]. The network input consists of pointwise visual contrast-based measurements that make it possible to discriminate and detect grape bunches even in uncontrolled acquisition conditions and with limited computational load. The latter property makes the proposed method implementable on smart devices and appropriate for on-site and real-time applications.
Grape Bunch Detection
Color opponent
Convolutional Neural Network
Human Perception of Visual Information
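The grape bunch detection abstracts above mention pointwise input features such as luminance and visual contrast. The following minimal sketch shows one way such a per-pixel feature vector could be assembled. It rests on assumptions that are not taken from the papers: luminance is computed with the Rec. 601 luma weights, contrast is a Weber-like ratio against the mean luminance of a small neighborhood, and the names `luminance` and `pixel_features` are hypothetical.

```python
def luminance(r, g, b):
    """Rec. 601 luma (assumed here; the papers may use another definition)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def pixel_features(image, row, col, radius=1):
    """Return [R, G, B, luminance, local contrast] for one pixel.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Contrast is a Weber-like ratio |L - mean(L)| / mean(L), where the mean
    is taken over the (2 * radius + 1)^2 neighborhood clipped at the borders.
    """
    height, width = len(image), len(image[0])
    r, g, b = image[row][col]
    lum = luminance(r, g, b)

    neighborhood = [
        luminance(*image[i][j])
        for i in range(max(0, row - radius), min(height, row + radius + 1))
        for j in range(max(0, col - radius), min(width, col + radius + 1))
    ]
    mean_lum = sum(neighborhood) / len(neighborhood)
    contrast = abs(lum - mean_lum) / mean_lum if mean_lum > 0 else 0.0
    return [r, g, b, lum, contrast]
```

A pointwise classifier would then receive one such vector per pixel. Which channels and which contrast definition work best is exactly the kind of dependence the first abstract reports studying.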
In this paper, an adaptive method for copy-move forgery detection and localization in digital images is proposed. The method employs a wavelet transform with a non-constant Q factor and characterizes image pixels through the multiscale behavior of the corresponding wavelet coefficients. Forged regions are then detected by treating as similar those pixels that have the same multiscale behavior. The method is pointwise and the length of the pixel feature vector is image-dependent, allowing for a more precise and faster detection of forged regions. The qualitative and quantitative evaluation of the experimental results reveals that the proposed method outperforms some existing transform-based methods in terms of performance and execution time.
The paper presents a method for color quantization (CQ) that uses visual contrast to determine an image-dependent color palette. The proposed method selects image regions hierarchically, according to the visual importance of their colors with respect to the whole image. The method is automatic and image-dependent, and it requires a moderate computational effort. Preliminary results show that the quality of the quantized images, measured in terms of Mean Square Error, Color Loss and SSIM, is competitive with some existing CQ approaches.
Color quantization
Human visual system
RGB color space
Visual contrast
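As a minimal illustration of the evaluation step mentioned in the color quantization abstract, the sketch below maps each pixel to its nearest palette color and measures the Mean Square Error of the result. The palette is simply given as input here. The contrast-driven, hierarchical palette selection of the paper is not reproduced, and the function names are hypothetical.

```python
def nearest_color(pixel, palette):
    """Return the palette entry closest to `pixel` in squared RGB distance."""
    return min(palette, key=lambda c: sum((p - q) ** 2 for p, q in zip(pixel, c)))


def quantize(pixels, palette):
    """Replace every (r, g, b) pixel with its nearest palette color."""
    return [nearest_color(p, palette) for p in pixels]


def mean_square_error(original, quantized):
    """MSE averaged over all pixels and all three color channels."""
    total = sum(
        (a - b) ** 2
        for p, q in zip(original, quantized)
        for a, b in zip(p, q)
    )
    return total / (3 * len(original))
```

A lower MSE for the same palette size indicates a palette that represents the image colors more faithfully. Color Loss and SSIM, also cited in the abstract, would be computed on the same pair of images.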
This paper focuses on an entropy-based formalism to speed up the evaluation of the Structural SIMilarity (SSIM) index in images affected by a global distortion. Looking at images as information sources, a visual distortion typical set can be defined for SSIM. This typical set consists of just a subset of the information belonging to the original image and the corresponding information in the distorted version. As a side effect, some general theoretical criteria can be given for computing any full-reference quality assessment measure so as to maximize its computational efficiency. Experimental results on various test images show that the proposed approach makes it possible to estimate SSIM with a considerable speedup (about 200 times) and a small relative error (often lower than 5%).
Information theory
SSIM
Asymptotic equipartition property
Image quality assessment
Typical set
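The idea of estimating SSIM from only part of the image can be sketched as follows. This is a simplified illustration under stated assumptions: blocks are flat lists of gray levels, the SSIM of a block uses plain (unweighted) statistics with the usual constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2, and the subset is chosen by uniform striding. The entropy-based selection of the typical set described in the abstract is not implemented, and the function names are hypothetical.

```python
def block_ssim(x, y, dynamic_range=255.0):
    """SSIM between two equally sized blocks given as flat lists of gray levels."""
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_ssim(blocks_x, blocks_y, step=1):
    """Average block SSIM over every `step`-th block.

    step=1 uses all blocks; a larger step evaluates a subset only, which
    reduces the number of operations roughly by a factor of `step`.
    """
    indices = range(0, len(blocks_x), step)
    scores = [block_ssim(blocks_x[i], blocks_y[i]) for i in indices]
    return sum(scores) / len(scores)
```

Comparing `mean_ssim(bx, by, step=1)` with `mean_ssim(bx, by, step=k)` on the same pair of images gives the relative estimation error incurred by the subset. How small that error stays depends on how the subset is selected, which is the point the paper addresses.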
HVAC systems are the largest energy consumers in a building, and a clean HVAC system can yield energy savings of about 11%. Moreover, particulate pollution is one of the main causes of cancer deaths and of several other kinds of health damage. This paper presents an innovative and non-invasive procedure for automatic indoor air quality assessment that depends on the cleaning conditions of the HVAC system. It is based on a mathematical algorithm that processes a few on-site physical measurements acquired by dedicated sensors at suitable locations and on a specific timetable. The output of the algorithm is a set of indexes that provide a snapshot of the system, with separate views of filters and ducts. The proposed methodology contributes to optimizing both HVAC maintenance procedures and air quality preservation. Robustness, portability and low implementation costs make it possible to plan maintenance interventions, limiting them to when standard HVAC working conditions need to be restored.
The paper presents a model for assessing image quality from a subset of pixels. It is based on the fact that human observers do not explore all of the image information when quantifying its degree of distortion. Hence, the vision process can be seen as consistent with the Asymptotic Equipartition Property. The latter ensures the existence of a subset of sequences of image blocks that can describe the whole image source with a small, prefixed error. Specifically, the well-known Structural SIMilarity index (SSIM) has been considered. Its entropy has been used to define a method for selecting those image pixels that enable SSIM to be estimated with sufficient precision. Experimental results show that the proposed selection method reduces the number of operations required by SSIM by a factor of about 200, with an estimation error of less than 8%.
Information Theory
SSIM
Image Quality Assessment
Typical Set
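For reference, the standard information-theoretic definition behind the Asymptotic Equipartition Property invoked above can be stated as follows, for an i.i.d. source $X$ with entropy $H(X)$ and sequences of length $n$. This is the textbook typical set. The "visual distortion typical set" of the paper is built on it, but its exact definition is not given in the abstract, so the link to image blocks is only indicative.

```latex
A_\epsilon^{(n)} = \left\{ (x_1,\dots,x_n) :
  \left| -\frac{1}{n}\log_2 p(x_1,\dots,x_n) - H(X) \right| \le \epsilon \right\},
\qquad
\Pr\!\left\{ A_\epsilon^{(n)} \right\} > 1-\epsilon \ \ \text{for $n$ sufficiently large},
\qquad
\left| A_\epsilon^{(n)} \right| \le 2^{\,n\,(H(X)+\epsilon)} .
```

In words, a comparatively small set of sequences carries almost all of the probability. This motivates evaluating a quality measure on a subset of the image information rather than on the whole image.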
This paper focuses on the use of the Jensen-Shannon divergence for guiding denoising. In particular, it aims at detecting those image regions where noise is masked; denoising is then inhibited where it is useless from the visual point of view. To this end, a reduced-reference version of the Jensen-Shannon divergence is introduced and used to determine a denoising map. The latter separates the image pixels that need to be denoised from those that have to be left unaltered. Experimental results show that the proposed method improves the denoising performance of some simple and conventional denoisers, in terms of both peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). In addition, it can help reduce the computational effort of some high-performing denoisers, while preserving the visual quality of the denoised images.
Computer vision; Signal to noise ratio; Computational effort; Image pixels; Image regions; Jensen-Shannon divergence; Peak signal to noise ratio; Reduced reference; Structural similarity indices (SSIM); Visual qualities
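The quantity at the core of this abstract can be illustrated with the standard Jensen-Shannon divergence between two discrete distributions, for instance normalized gray-level histograms of two image regions. This sketch computes the ordinary divergence from two full distributions. The reduced-reference variant introduced in the paper is not specified in the abstract and is not reproduced here. The helper `normalized_histogram` and its default bin count are illustrative assumptions.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; zero-probability terms of p are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # Wherever p_i > 0 the mixture m_i is also > 0, so the ratios are well defined.
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def normalized_histogram(values, bins=8, lo=0, hi=256):
    """Histogram of `values` in [lo, hi) with `bins` equal bins, normalized to sum to 1."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(values) for c in counts]
```

A denoising map in the spirit of the abstract could threshold such a divergence region by region, although the actual decision rule used in the paper is not stated here.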
This paper presents a methodology for assessing and monitoring the cleaning state of the heating, ventilation, and air conditioning (HVAC) system of a building. It consists of a non-invasive method for measuring the amount of dust in the whole ventilation system, that is, the set of filters and air ducts. Specifically, it defines the minimum number of measurements, their timetable, locations, and acquisition conditions. The proposed method promotes early intervention on the system and helps guarantee high indoor air quality and proper HVAC working conditions. The effectiveness of the method is demonstrated by experimental results on different case studies.
Time-scale transforms play a fundamental role in the compact representation of signals and images [1]. Nonlinear time representations have contributed significantly to the definition of more flexible and adaptive transforms. However, in many applications signals are better characterized in the frequency domain. In particular, the distribution of frequencies along the frequency axis depends strictly on the signal under study, whereas the partition of the frequency axis provided by conventional transforms obeys more rigid rules. It would therefore be desirable to have a transform that can adapt to the frequency content of the signal under study, i.e. one with a changing Q factor. The rational dilation wavelet transform (RDWT) [2, 3] is a flexible tool that makes it possible to change the dilation factor at each step of the transform, as well as the analyzing window function, while maintaining the structure and properties of the classical wavelet transform, which is implemented through perfect reconstruction filter banks. Some examples of how to select significant scales, i.e. the central frequencies and bandwidths of the filter bank, will be shown for different applications, including image denoising, deblurring and fusion. The properties of the corresponding adaptive transform will also be discussed.
wavelet transform
contrast sensitivity function
image denoising
image deblurring
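The link between the dilation factor and the Q factor can be made concrete with an idealized calculation. The sketch below assumes brick-wall (ideal) band-pass filters, so that a cascade with dilation factors a_1, a_2, ... splits the normalized frequency axis at 1, 1/a_1, 1/(a_1 a_2), and so on (in units of pi). It defines Q as the central frequency divided by the bandwidth. Real RDWT filters overlap and are not ideal, so these numbers are only indicative, and the function names are hypothetical.

```python
from fractions import Fraction


def band_edges(dilations):
    """Ideal (lower, upper) band edges, in units of pi, for per-level dilation factors."""
    upper = Fraction(1)
    bands = []
    for a in dilations:
        lower = upper / a
        bands.append((lower, upper))
        upper = lower  # the next level analyzes the remaining low-pass band
    return bands


def q_factor(lower, upper):
    """Q factor of an ideal band: central frequency divided by bandwidth."""
    return ((lower + upper) / 2) / (upper - lower)


def q_factors(dilations):
    """Q factor of each level of the cascade."""
    return [q_factor(lo, hi) for lo, hi in band_edges(dilations)]
```

Under these assumptions a dyadic cascade (all factors equal to 2) has the same Q = 3/2 at every level, while a dilation factor of 3/2 gives Q = 5/2. Changing the dilation factor from one level to the next therefore changes Q along the frequency axis, which is the adaptivity the abstract refers to.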
This paper presents a novel approach for extracting the transient content of audio signals, which are usually represented as a superposition of stationary, transient, and stochastic components. The proposed model exploits the predictable and peculiar time-scale behavior of transients by modeling them as a superposition of suitable wavelet atoms. The latter make it possible to predict transient information even at scales where the tonal component is dominant. In this way it is possible to avoid, if required, the pre-analysis of the tonal component. Extensive experimental results show that the proposed model achieves good performance with a moderate computational effort and without any dependence on the user.
This letter investigates the possibility of removing noise at jump discontinuities using the sorted copy of the signal. It will be proved that sorting makes noise predictable, so that it can be reproduced and subtracted from the sorted noisy signal. It will also be shown that the proposed method can replace the edge-preserving term in an anisotropic diffusion scheme, with gains in terms of mean square error, edge preservation and computational effort.
This correspondence presents a novel approach for translational motion estimation based on the phase of the Fourier transform. It exploits the equality between the average of a group of successive frames and the convolution of the reference frame with an impulse train function. The use of suitable space-filling curves makes it possible to reduce the error in motion estimation, making the proposed approach robust to noise. Experimental results show that the proposed approach outperforms available techniques in terms of objective (PSNR) and subjective quality, with a lower computational effort.
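The principle of recovering a translation from Fourier phase can be sketched with classical phase correlation on a 1D signal. This is the standard technique, not the frame-averaging and impulse-train formulation of the correspondence. It assumes a circular integer shift, and it uses a naive O(N^2) DFT only to stay dependency-free. A 1D signal is used for brevity; scanning a frame along a space-filling curve, as the abstract mentions, is not shown. The function names are hypothetical.

```python
import cmath


def dft(signal, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(
            signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        out.append(acc / n if inverse else acc)
    return out


def estimate_shift(reference, shifted):
    """Estimate d such that shifted[t] == reference[(t - d) mod n]."""
    f_ref = dft(reference)
    f_shift = dft(shifted)
    cross_phase = []
    for f, g in zip(f_ref, f_shift):
        product = g * f.conjugate()
        magnitude = abs(product)
        # Keep only the phase; bins with no energy carry no information.
        cross_phase.append(product / magnitude if magnitude > 1e-12 else 0j)
    correlation = dft(cross_phase, inverse=True)
    # The inverse transform of a pure linear phase is an impulse at the shift.
    return max(range(len(correlation)), key=lambda i: correlation[i].real)
```

Because only the phase of the cross-spectrum is kept, the correlation peak is sharp and largely independent of the signal's amplitude spectrum. This is the usual reason phase-based estimators are attractive for motion estimation.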