List of publications

130 results found

2021 Conference proceedings paper restricted access

Data-driven simulation of contagions in public venues

The COVID-19 pandemic triggered a global research effort to define and assess timely and effective containment policies. Understanding the role that specific venues play in the dynamics of epidemic spread is critical to guide the implementation of fine-grained non-pharmaceutical interventions (NPIs). In this paper, we present a new model of context-dependent interactions that integrates information about the surrounding territory and the social fabric. Building on this model, we developed an open-source, data-driven simulator of how specific gathering places are used, which can be easily configured to project and compare multiple scenarios. We focused on the largest park in the City of Florence, Italy, to provide experimental evidence that our simulator produces contact graphs with unique, realistic features, and that gaining control of the mechanisms that govern interactions at the local scale makes it possible to unveil, and possibly control, non-trivial aspects of the epidemic.

epidemics contact networks agent-based data-driven
2021 Journal article open access

Inferring urban social networks from publicly available data

The definition of suitable generative models for synthetic yet realistic social networks is a widely studied problem in the literature. By not being tied to any real data, random graph models cannot capture all the subtleties of real networks and are inadequate for many practical contexts, including areas of research, such as computational epidemiology, that have recently become high on the agenda. At the same time, the so-called contact networks describe interactions, rather than relationships, and are strongly dependent on the application and on the size and quality of the sample data used to infer them. To fill the gap between these two approaches, we present a data-driven model for urban social networks, implemented and released as open source software. By using just widely available aggregated demographic and social-mixing data, we are able to create, for a territory of interest, an age-stratified and geo-referenced synthetic population whose individuals are connected by "strong ties" of two types: intra-household (e.g., kinship) or friendship. While household links are entirely data-driven, we propose a parametric probabilistic model for friendship, based on the assumption that distances and age differences play a role, and that not all individuals are equally sociable. The demographic and geographic factors governing the structure of the obtained network under different configurations are thoroughly studied through extensive simulations focused on three Italian cities of different sizes.

simulator open source data-driven graph model urban social network
2021 Journal article restricted access

Rayleigh-Bénard convection of a model emulsion: anomalous heat-flux fluctuations and finite-size droplet effects

We present mesoscale numerical simulations of Rayleigh-Bénard (RB) convection in a two-dimensional model emulsion. The systems under study consist of finite-size droplets, whose concentration is systematically varied from small (Newtonian emulsions) to large values (non-Newtonian emulsions). We focus on the characterisation of the heat transfer properties close to the transition from conductive to convective states, where it is well known that a homogeneous Newtonian system exhibits a steady flow and a time-independent heat flux. In marked contrast, emulsions exhibit non-steady dynamics with fluctuations in the heat flux. In this paper, we aim to characterise such non-steady dynamics via detailed studies of the time-averaged heat flux and its fluctuations. To quantitatively understand the time-averaged heat flux, we propose a side-by-side comparison between the emulsion system and a single-phase (SP) system, whose viscosity is suitably constructed from the shear rheology of the emulsion. We show that such local closure works well only when a suitable degree of coarse-graining (at the droplet scale) is introduced in the local viscosity. To delve deeper into the fluctuations in the heat flux, we furthermore propose a side-by-side comparison between a Newtonian emulsion (i.e., with a small droplet concentration) and a non-Newtonian emulsion (i.e., with a large droplet concentration), at fixed time-averaged heat flux. This comparison elucidates that finite-size droplets and the non-Newtonian rheology cooperate to trigger enhanced heat-flux fluctuations at the droplet scales. These enhanced fluctuations are rooted in the emergence of space correlations among distant droplets, which we highlight via direct measurements of the droplet displacements and the characterisation of the associated correlation function. The observed findings offer insights into heat transfer properties for confined systems possessing finite-size constituents.

Soft Matter Emulsions Thermal Convection Rheology
2020 Journal article open access

AMG based on compatible weighted matching on GPUs

GPU version of AMG preconditioner

AMG GPU
2020 Journal article open access

BootCMatchG: An adaptive Algebraic MultiGrid linear solver for GPUs

Sparse solvers are one of the building blocks of any technology for reliable and high-performance scientific and engineering computing. In this paper we present a software package which implements an efficient multigrid sparse solver running on Graphics Processing Units. The package is a branch of a wider initiative of software development for sparse Linear Algebra computations on emergent HPC architectures involving a large research group working in many application projects over the last ten years.

Adaptive AMG GPUs
2020 Journal article metadata only access

LBsoft: A parallel open-source software for simulation of colloidal systems

We present LBsoft, an open-source software package developed mainly to simulate the hydrodynamics of colloidal systems, based on the concurrent coupling between lattice Boltzmann methods for the fluid and discrete particle dynamics for the colloids. Such coupling has been developed before but, to the best of our knowledge, no detailed discussion of the programming issues to be faced in order to attain an efficient implementation on parallel architectures has been presented to date. In this paper, we describe in detail the underlying multi-scale models and their coupling procedure, alongside a description of the relevant input variables, to facilitate third-party use. The code is designed to exploit parallel computing platforms, also taking advantage of the recent AVX-512 instruction set. We focus on LBsoft structure, functionality, parallel implementation, performance and availability, so as to make this computational tool more accessible to the research community in the field.

Lattice-Boltzmann Colloids Parallel computing
2020 Journal article metadata only access

High performance implementations of the 2D Ising model on GPUs

Romero J ; Bisson M ; Fatica M ; Bernaschi M

We present and make available novel implementations of the two-dimensional Ising model, used as a benchmark to show the computational capabilities of modern Graphics Processing Units (GPUs). The rich programming environment now available on GPUs and flexible hardware capabilities allowed us to quickly experiment with several implementation ideas: a simple stencil-based algorithm, recasting the stencil operations into matrix multiplies to take advantage of Tensor Cores available on NVIDIA GPUs, and a highly optimized multi-spin coding approach. Using the managed memory API available in CUDA allows for simple and efficient distribution of these implementations across a multi-GPU NVIDIA DGX-2 server. We show that even a basic GPU implementation can outperform current results published on TPUs (Yang et al., 2019) and that the optimized multi-GPU implementation can simulate very large lattices faster than custom FPGA solutions (Ortega-Zamorano et al., 2016). Program summary: Program title: cuIsing (optimized). CPC Library link to program files: http://dx.doi.org/10.17632/xrb9xtkbcp.1 Licensing provisions: MIT license. Programming languages: CUDA C, Python. Nature of problem: Two-dimensional Ising model for spin systems. Solution method: Checkerboard Metropolis algorithm.

6.5 software including parallel algorithms; 23 statistical physics and thermodynamics; Ising model; GPU programming
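
The checkerboard Metropolis method named in the program summary above can be illustrated with a minimal pure-Python sketch. It is not the cuIsing code: the function names, the lattice size, the coupling J = 1 and the zero external field are all assumptions made for illustration. Sites are split into two colours by the parity of i + j; since every nearest neighbour of a site has the opposite colour, all sites of one colour can be updated independently of each other, which is the property that lets the scheme be parallelised.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of a square periodic lattice, one colour at a time.

    `spins` is a list of lists holding +1/-1 and is modified in place.
    The lattice side must be even so the two colours stay consistent
    across the periodic boundaries. Assumes J = 1 and no external field.
    """
    size = len(spins)
    for colour in (0, 1):
        for i in range(size):
            for j in range(size):
                if (i + j) % 2 != colour:
                    continue  # this site belongs to the other sub-lattice
                # Sum of the four nearest neighbours (all of the other colour).
                nn = (spins[(i - 1) % size][j] + spins[(i + 1) % size][j]
                      + spins[i][(j - 1) % size] + spins[i][(j + 1) % size])
                # Energy change if spin (i, j) were flipped.
                delta_e = 2 * spins[i][j] * nn
                # Metropolis rule: always accept downhill moves, otherwise
                # accept with probability exp(-beta * delta_e).
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]


def magnetisation(spins):
    """Average spin value of the lattice."""
    size = len(spins)
    return sum(map(sum, spins)) / (size * size)


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
side = 16               # hypothetical lattice side (even)
spins = [[rng.choice((-1, 1)) for _ in range(side)] for _ in range(side)]
for _ in range(50):
    checkerboard_sweep(spins, beta=0.6, rng=rng)
print(abs(magnetisation(spins)))
```

In this sketch the two colours are visited sequentially by nested loops; a GPU version would instead assign the sites of one colour to parallel threads, which is safe because no two sites of the same colour are neighbours.
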
2020 Journal article metadata only access

Strong ergodicity breaking in aging of mean-field spin glasses

Bernaschi Massimo ; Billoire Alain ; Maiorano Andrea ; Parisi Giorgio ; Ricci-Tersenghi Federico

Out-of-equilibrium relaxation processes show aging if they become slower as time passes. Aging processes are ubiquitous and play a fundamental role in the physics of glasses and spin glasses and in other applications (e.g., in algorithms minimizing complex cost/loss functions). The theory of aging in the out-of-equilibrium dynamics of mean-field spin glass models has achieved a fundamental role, thanks to the asymptotic analytic solution found by Cugliandolo and Kurchan. However, this solution is based on assumptions (e.g., the weak ergodicity breaking hypothesis) which have never been put under a strong test until now. In the present work, we present the results of an extraordinarily large set of numerical simulations of the prototypical mean-field spin glass models, namely the Sherrington-Kirkpatrick and the Viana-Bray models. Thanks to a very intensive use of graphics processing units (GPUs), we have been able to run the latter model for more than 2^64 spin updates and thus safely extrapolate the numerical data both in the thermodynamic limit and in the large-time limit. The measurements of the two-time correlation functions in isothermal aging after a quench from a random initial configuration to a temperature T < T_c provide clear evidence that, at large times, such correlations do not decay to zero as expected by assuming weak ergodicity breaking. We conclude that strong ergodicity breaking takes place in the aging dynamics of mean-field spin glasses which, asymptotically, takes place in a confined configurational space. Theoretical models for the aging dynamics need to be revised accordingly.

spin glasses phase transitions off-equilibrium dynamics
2020 Journal article metadata only access

Toward exascale design of soft mesoscale materials

We provide a brief survey of our current developments in the simulation-based design of novel families of mesoscale porous materials using computational kinetic theory. Prospective applications on exascale computers are also briefly discussed and commented on, with reference to two specific examples of soft mesoscale materials: microfluid crystals and bi-continuous jels.

computational fluid dynamics
2019 Journal article metadata only access

Kite attack: reshaping the cube attack for a flexible GPU-based maxterm search

Dinur and Shamir's cube attack has attracted significant attention in the literature. Nevertheless, the lack of implementations achieving effective results casts doubts on its practical relevance. On the theoretical side, promising results have been recently achieved leveraging on division trails. The present paper follows a more practical approach and aims at giving new impetus to this line of research by means of a cipher-independent flexible framework that is able to carry out the cube attack on GPU/CPU clusters. We address all issues posed by a GPU implementation, providing evidence in support of parallel variants of the attack and identifying viable directions for solving open problems in the future. We report the results of running our GPU-based cube attack against round-reduced versions of three well-known ciphers: Trivium, Grain-128 and SNOW 3G. Our attack against Trivium improves the state of the art, permitting full key recovery for Trivium reduced to (up to) 781 initialization rounds (out of 1152) and finding the first-ever maxterm after 800 rounds. In this paper, we also present the first standard cube attack (i.e., neither dynamic nor tester) to yield maxterms for Grain-128 up to 160 initialization rounds on non-programmable hardware. We include a thorough evaluation of the impact of system parameters and GPU architecture on the performance. Moreover, we demonstrate the scalability of our solution on multi-GPU systems. We believe that our extensive set of results can be useful for the cryptographic engineering community at large and can pave the way to further results in the area.

Cube attack Algebraic attacks Graphics processing unit
2019 Journal article metadata only access

Benchmarking multi-GPU applications on modern multi-GPU integrated systems

Bernaschi M ; Agostini E ; Rossetti D

GPUs are very powerful computing accelerators that are often employed in a single-device configuration. However, there is a steadily growing interest in using multiple GPUs concurrently, both to overcome the memory limitations of a single device and to further reduce execution times. Until recently, communication among GPUs had been carried out mainly by using networking technologies originally devised for standard CPUs, with the CPU playing an active role in the communication. However, new alternatives are becoming available in which a moderate number of GPUs are directly connected to each other by means of proprietary technologies. We present the results of a set of experiments aimed at assessing the performance of some of these hardware/software platforms using a particularly challenging application as a benchmark. We release its source code to help those interested in reproducing or extending our results.

approximate inverse; DGX-1; GPUDirect; POWER9; spin
2019 Journal article metadata only access

Exploiting multi-level parallelism for stitching very large microscopy images

Bria A ; Bernaschi M ; Guarrasi M ; Iannello G

Due to the limited field of view of microscopes, acquisitions of macroscopic specimens require many parallel image stacks to cover the whole volume of interest. Overlapping regions are introduced among stacks to enable automatic alignment by means of a 3D stitching tool. Since state-of-the-art microscopes coupled with chemical clearing procedures can generate 3D images whose size exceeds one terabyte, parallelization is required to keep stitching time within acceptable limits. In the present paper we discuss how multi-level parallelization reduces the execution times of TeraStitcher, a tool designed to deal with very large images. Two algorithms that partition the dataset for efficient parallelization in a transparent way are presented, together with experimental results proving the effectiveness of the approach, which achieves a speedup close to 300× when both coarse- and fine-grained parallelism are exploited. Multi-level parallelization of TeraStitcher led to a significant reduction of processing times with no changes in the user interface, and with no additional effort required for the maintenance of code.

3D microscopy; stitching; terabyte images; parallel processing; data partitioning; GPU
2019 Journal article metadata only access

Forensic analysis of Microsoft Skype for Business

Nicoletti M ; Bernaschi M

We present three case studies to illustrate a methodology for conducting forensic investigations on Microsoft Skype for Business. The proposed methodology helps to retrieve information on chat and audio communications made by any account that accessed the PC, to retrieve IP addresses and communication routes for all the participants of a call, and to retrieve forensic evidence to identify the end-user devices of a VoIP call by analyzing the CODECs exchanged by the clients during the SIP (Session Initiation Protocol) handshaking phase. This information may help the investigator either to corroborate or to contradict an investigative hypothesis.

Skype for Business; VoIP forensics; SIP forensics; Codecs
2019 Journal article metadata only access

On the impact of controlled wall roughness shape on the flow of a soft material

We explore the impact of geometrical corrugations on the near-wall flow properties of a soft material driven in a confined rough microchannel. By means of numerical simulations, we perform a quantitative analysis of the relation between the flow rate Φ and the wall stress σ_w for a number of setups, by changing both the roughness values and the roughness shape. Roughness suppresses the flow, with the existence of a characteristic value of σ_w at which flow sets in. Just above the onset of flow, we quantitatively analyze the relation between Φ and σ_w. While for smooth walls a linear dependence is observed, steeper behaviours are found to set in by increasing wall roughness. The variation of the steepness, in turn, depends on the shape of the wall roughness, wherein gentle steepness changes are promoted by a variable space localization of the roughness.

YIELD-STRESS FLUID; LOCAL RHEOLOGY; SLIP; EMULSIONS; MICROGEL
2019 Conference proceedings paper open access

Spiders like Onions: on the Network of Tor Hidden Services

Tor hidden services allow offering and accessing various Internet resources while guaranteeing a high degree of provider and user anonymity. So far, most research work on the Tor network has aimed at discovering protocol vulnerabilities to de-anonymize users and services. Other work has aimed at estimating the number of available hidden services and classifying them. What remains largely unknown is the structure of the graph defined by the network of Tor services. In this paper, we describe the topology of the Tor graph (aggregated at the hidden service level), measuring both global and local properties by means of well-known metrics. We consider three different snapshots obtained by extensively crawling Tor three times over a five-month time frame. We separately study these three graphs and their shared "stable" core. In doing so, besides assessing the renowned volatility of Tor hidden services, we make it possible to distinguish time-dependent and structural aspects of the Tor graph. Our findings show that, among other things, the graph of Tor hidden services presents some of the characteristics of social and surface web graphs, along with a few unique peculiarities, such as a very high percentage of nodes having no outbound links.

Web Graph Tor Complex Networks Dark Web
2019 Journal article metadata only access

Mesoscopic simulations at the physics-chemistry-biology interface

This review discusses the lattice Boltzmann-particle dynamics (LBPD) multiscale paradigm for the simulation of complex states of flowing matter at the interface between physics, chemistry, and biology. In particular, current large-scale LBPD simulations of biopolymer translocation across cellular membranes, molecular transport in ion channels, and amyloid aggregation in cells are described. Prospects are provided for future LBPD explorations in the direction of cellular organization, the direct simulation of full biological organelles, all the way up to physiological scales of potential relevance to future precision-medicine applications, such as the accurate description of homeostatic processes. It is argued that, with the advent of Exascale computing, the mesoscale physics approach advocated in this review may come of age in the next decade and open up new exciting perspectives for physics-based computational medicine.

Bioinformatics Biopolymers
2019 Unpublished presentation / communication (conference, event, webinar...) metadata only access

Analysing the Tor Web with High Performance Graph Algorithms

The exploration and analysis of Web graphs has flourished in the recent past, producing a large number of relevant and interesting research results. However, the unique characteristics of the Tor network demand specific algorithms to explore and analyze it. Tor is an anonymity network that allows offering and accessing various Internet resources while guaranteeing a high degree of provider and user anonymity. So far the attention of the research community has focused on assessing the security of the Tor infrastructure. Most research work on the Tor network has aimed at discovering protocol vulnerabilities to de-anonymize users and services, while little or no information is available about the topology of the Tor Web graph or the relationship between pages' content and topological structure. With our work we aim to address this lack of information. We describe the topology of the Tor Web graph, measuring both global and local properties by means of well-known metrics that, due to the size of the network, require high performance algorithms. We consider three different snapshots obtained by extensively crawling Tor three times over a five-month time frame. Finally, we present a correlation analysis of pages' semantics and topology, discussing novel insights about the Tor Web organization and its content. Our findings show that the Tor graph presents some of the characteristics of social and surface web graphs, along with a few unique peculiarities.

Tor Graph Analysis HPC
2019 Journal article metadata only access

Towards Exascale Lattice Boltzmann computing

We discuss the state of the art of Lattice Boltzmann (LB) computing, with special focus on prospective LB schemes capable of meeting the forthcoming Exascale challenge. After reviewing the basic notions of LB computing, we discuss current techniques to improve the performance of LB codes on parallel machines and illustrate selected leading-edge applications in the Petascale range. Finally, we put forward a few ideas on how to improve the communication/computation overlap in current large-scale LB simulations, as well as possible strategies towards fault-tolerant LB schemes.

computational fluid dynamics
2018 Journal article metadata only access

Multilevel parallelism for the exploration of large-scale graphs

We present the most recent release of our parallel implementation of the BFS and BC algorithms for the study of large-scale graphs. Although our reference platform is a high-end cluster of new-generation NVIDIA GPUs and some of our optimizations are CUDA specific, most of our ideas can be applied to other platforms offering multiple levels of parallelism. We exploit multi-level parallel processing through a hybrid programming paradigm that combines highly tuned CUDA kernels, for the computations performed by each node, and explicit data exchange through the Message Passing Interface (MPI), for the communications among nodes. The results of the numerical experiments show that the performance of our code is comparable to or better than that of other state-of-the-art solutions. For the BFS, for instance, we reach a peak performance of 200 GigaTEPS on a single GPU and 5.5 TeraTEPS on 1024 Pascal GPUs. We release our source codes both for reproducing the results and for facilitating their usage as a building block for the implementation of other algorithms.

Large graphs; graph algorithms; parallel algorithms; parallel programming; distributed programming; GPU; CUDA
2018 Conference proceedings paper open access

Traffic Data: Exploratory Data Analysis with Apache Accumulo

The amount of traffic data collected by automatic number plate reading systems is constantly increasing. It is therefore important for law enforcement agencies to find convenient techniques and tools to analyze such data. In this paper we propose a scalable and fully automated procedure, leveraging the Apache Accumulo technology, that allows effective importing and processing of traffic data. We discuss preliminary results obtained by using our application for the analysis of a dataset containing real traffic data provided by the Italian National Police. We believe the results described here can pave the way to further interesting research on the matter.

Apache Accumulo Exploratory Data Analysis Traffic Data