Research

Our research deals with the analysis of complex systems that can be abstracted as networks or graphs. A central methodological issue underlying our work is how one can study and integrate the multiple levels of organization which are commonly found in a range of systems. Our research combines ‘bottom-up’ dynamical models, and ‘top-down’ data-driven approaches, and uses a blend of tools from control theory, dynamical systems, stochastic processes, machine learning and statistics.

Studying Connections — Networks as Relational Data

Illustration of a simplicial complex

Within network analysis and graph mining, broadly construed, we often focus on relational data. Indeed, much of the main focus of Network Science has initially been to chart and comprehend the connectivity patterns of real-world networks. Stated differently, the data corresponds to the edges (connections) of a network, and we aim to learn about a system by finding patterns in these stochastic connections. Some of the most important areas of Network Science thus include community detection (graph clustering), or the ranking of nodes according to centrality measures such as PageRank, HITS scores, etc.

Within our work we have contributed particularly to the area of community detection, where we have co-developed the Markov stability framework, which leverages random walks to unveil salient patterns in complex networks. More recently, we have also considered approaches for the related problem of finding (symmetry-based) roles in networks, and investigated issues related to the statistical detection of hierarchical arrangements of communities.

In applications, we are often confronted with only one observed network sample, in which the observed network edges are subject to uncertainty. In these situations we thus need to adopt a statistical perspective to account for the uncertainty in the data and reach robust conclusions. A distinctive work from this perspective is our work on using graphons to quantify the uncertainty of centrality measures (such as PageRank) for networks. This departs from classical centrality analyses which almost always assume that the considered networks is given “as is” and there is no uncertainty with respect to the presence of edges. In more recent work we have also investigated how uncertainties arising, e.g., from homophily or group biases can impact centrality measures and can impact network formation processes.

Studying data on networks — dynamics and signals on graphs

Connectivity of a stochastic block model

Understanding how network structure can impact dynamical processes on those networks, e.g., the formation of opinions or the patterns in spiking neurons, is another main theme of our research. Indeed, in many problems emerging from control theory and dynamical systems, we consider a distributed dynamical system supported on a graph structure. The edges are typically defined with low uncertainty and mediate the interactions of a coupled dynamical process on the nodes we aim to understand. Our work focuses on understanding such dynamics and data on networks in a more principled way by taking into account the underlying network structure. For instance, we have studied problems of partial synchronization in neuroscience and coupled oscillator models and linked these phenomena to the presence of certain graph-partitions. Another line of research is linking network partitioning problems, typically considered in unsupervised learning, with concepts from Dynamical Systems and Control Theory. For instance, we researched grouping nodes within a network that have the same dynamical impact on the system as measured by their impulse response. Remarkably, when focusing on diffusion processes on undirected networks, one can recover popular notions of network clustering (e.g., Modularity, Spectral methods) and dimensionality reduction (diffusion maps) from this perspective.

Similarly, fast-growing parts of signal processing and machine learning are concerned with data supported on the nodes of a graph. Here the graph is considered a fixed entity, which is either given, or constructed from (high-dimensional) data, and our primary goal is to leverage the graph structure to make inferences about data supported on the nodes. Graph-based semi-supervised learning is a concrete example: here, the graph structure is utilized to extrapolate from a few known node labels to all node (data point) labels, by enforcing smoothness along the graph edges. Most prominently, however, the utility of graph-based learning has been exhibited in the recent rise of graph neural networks. We have contributed both to the area of graph signal processing and graph-neural networks in particular by focussing on extensions of these techniques to analyze flows defined on the edges of graphs (or more generally complexes). Furthermore, we have investigated issues related to the robustness of graph neural networks (in particular, related to heterophily) as well as fairness.

Studying high-dimensional data via graphs and complexes — topological data analysis

Connectivity of a stochastic block model

High-dimensional (point-cloud) data commonly contains some geometric or topological information. Topological tools enable the classification of topological or geometric structure, for instance, the detection of possible clusters in the data. A central notion for the analysis of such point cloud data is often that the observed data is assumed to be a noisy sample from a low-dimensional topological or geometric object, such as a manifold embedded in a larger sample space. To reveal the shape of this manifold, we can construct a graph that serves as a discrete proxy for this continuous object: The vertices correspond to the data points, and two vertices are connected by an edge if the distance between the corresponding data points is sufficiently small. The resulting (family of) graphs can then be analyzed to provide a low-dimensional representation of the data, e.g., by considering distance measures on the graph, rather than in the ambient space of the data. The idea of approximating the global geometric structure of high-dimensional data using graphs naturally generalizes beyond local pairwise geometric relationships to local multi-way geometric relationships, leading to the definition of (simplicial and cellular) complexes, whose analysis can reveal important properties about the underlying data — which is the central tenet of topological data analysis. The complexes constructed in this way also give rise to a hierarchy of Hodge-Laplacians, which include the graph Laplacian matrix as a special case, and can be used to extract geometric and topological information about the data. We have introduced several techniques to extend the spectral analysis of these Hodge-Laplacians for point cloud data, thereby bridging ideas from persistent homology with the interpretability of classical spectral clustering.

We have also laid foundations for using simplicial complexes for the analysis of complex systems and relational data (perspective A), by translating the problem of link-prediction to the context of simplicial complexes. We further introduced a notion of diffusion for simplicial complexes, which has enabled us to extend concepts such as PageRank to the domain of simplicial complexes, and have now started to explore further applications of simplicial complexes in the context of dynamical systems and control.

Interdisciplinary Applications

Illustrations of protein, neuron spike trains, EU road network

One of the strengths of network models is their versatility: networks have become ubiquitous abstractions for social, physical, biological and engineered systems. Facilitated by our multidisciplinary background, we work on a range of applications. We are particularly interested in biological problems, such as the analysis of neurophysiological data or the analysis of proteins and genetic data, and on applications concerning social systems such as opinion formation in social networks and modelling systematic bias.

Read More