The Biomedical Network Science (BIONETS) lab investigates molecular disease mechanisms using techniques from network science, combinatorial optimization, and artificial intelligence. We develop algorithms and tools to mine multi-omics data for such mechanisms and to identify novel strategies for mechanistically grounded drug repurposing and causally effective treatments of complex diseases. We also develop privacy-preserving decentralized biomedical AI solutions, which enable cross-institutional studies on sensitive data.
Research projects
Current projects
Dimensionality reduction for molecular data based on the explanatory power of differential regulatory networks
(Third Party Funds Group – Overall project)
Funding source: Bundesministerium für Bildung und Forschung (BMBF)
Dimensionality reduction for molecular data based on the explanatory power of differential regulatory networks – Subproject A (TP A)
(Third Party Funds Group – Sub project)
Term: 1 March 2023 - 28 February 2026
Funding source: BMBF / collaborative project (Verbundprojekt)
Rapid advances in single-cell RNA sequencing (scRNA-seq) technology are leading to ever-increasing dimensions of the generated molecular data, which complicates data analyses. In NetMap, new scalable and robust dimensionality reduction approaches for scRNA-seq data will be developed. To this end, dimensionality reduction will be integrated into a central task of the systems medicine analysis of scRNA-seq data: inference of gene regulatory networks (GRNs) and driver transcription factors based on cell expression profiles. Each resulting dimension will correspond to a driver GRN, and the coordinate of a cell in this low-dimensional representation will quantify the extent to which the particular driver GRN explains the cell's gene expression profile. These new methods will be implemented as a user-friendly software platform for exploratory expert-in-the-loop analysis and in silico prediction of drug repurposing candidates.
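The idea of a cell coordinate that quantifies how well a driver GRN explains the cell's expression profile can be illustrated with a minimal sketch. This is not NetMap's method: it assumes that each driver GRN is summarized by a hypothetical signed weight vector over its target genes, that expression values are already centered, and it uses the squared cosine similarity (the fraction of the profile's squared norm captured by projecting onto the GRN signature) as a stand-in for "explanatory power". All gene, GRN, and cell names are invented for illustration.

```python
from typing import Dict, List

def explained_fraction(profile: Dict[str, float], grn_weights: Dict[str, float]) -> float:
    """Fraction of a cell's squared expression norm captured by projecting
    the profile onto one GRN signature (squared cosine similarity)."""
    dot = sum(profile.get(gene, 0.0) * w for gene, w in grn_weights.items())
    norm_x = sum(v * v for v in profile.values())
    norm_w = sum(w * w for w in grn_weights.values())
    if norm_x == 0.0 or norm_w == 0.0:
        return 0.0
    return dot * dot / (norm_x * norm_w)

def embed(cells: Dict[str, Dict[str, float]],
          grns: Dict[str, Dict[str, float]]) -> Dict[str, List[float]]:
    """Map every cell to one coordinate per driver GRN (GRNs in sorted name order)."""
    names = sorted(grns)
    return {cell: [explained_fraction(profile, grns[name]) for name in names]
            for cell, profile in cells.items()}

# Hypothetical toy data: two driver GRNs with signed target-gene weights.
grns = {
    "GRN_A": {"g1": 1.0, "g2": 1.0},
    "GRN_B": {"g3": 1.0, "g4": -1.0},
}
# Hypothetical centered expression profiles of two cells.
cells = {
    "cell1": {"g1": 2.0, "g2": 2.0, "g3": 0.0, "g4": 0.0},
    "cell2": {"g1": 0.0, "g2": 0.0, "g3": 1.0, "g4": -1.0},
}
embedding = embed(cells, grns)
```

In this toy setting each cell is represented by two coordinates, one per GRN, instead of four gene-level values. A real approach would have to infer the GRNs and their weights from the data rather than take them as given.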
As a case study, we will investigate CD4 helper T cell exhaustion, a potential limiting factor in immunotherapy. NetMap's strategy consists of (1) analyzing the phenotypic heterogeneity of exhausted CD4 T cells, (2) identifying the transcriptional mechanisms that control this heterogeneity, and (3) amplifying or eliminating specific subsets and testing their functional impact. This will allow the development of an atlas of the gene regulatory landscape of exhausted CD4 T cells, while in vivo testing of key regulatory transcription factors will help demonstrate the power of the developed methods and allow the predictions to be evaluated and improved.
Unsupervised Network Medicine for Longitudinal Omics Data
(FAU Funds)
In recent years, large amounts of molecular profiling data (also called “omics data”) have become available. This has raised hopes of identifying so-called disease modules, i.e., sets of functionally related molecules constituting candidate disease mechanisms. However, omics data tend to be underdetermined (far more measured molecules than samples) and noisy, so modules identified via purely statistical means are often unstable and functionally uninformative. Network-based disease module mining methods (DMMMs) therefore project omics data onto biological networks such as protein-protein interaction (PPI) networks, gene regulatory networks (GRNs), or microbial interaction networks (MINs). Subsequently, network algorithms are used to identify disease modules consisting of small subnetworks. This dramatically decreases the size of the search space and prioritizes disease modules consisting of functionally related molecules, which improves both the stability and the functional relevance of the discovered modules.
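The general idea of projecting molecule-level scores onto a network and searching for a small connected subnetwork can be sketched as follows. This is a deliberately simple greedy heuristic for illustration only, not the algorithm of any particular DMMM; the network, the node scores (imagined here as some omics-derived signal per molecule), and the function name `greedy_module` are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

def build_adjacency(edges: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list."""
    adjacency: Dict[str, Set[str]] = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def greedy_module(adjacency: Dict[str, Set[str]], scores: Dict[str, float],
                  seed: str, max_size: int = 5) -> Set[str]:
    """Grow a connected subnetwork from `seed`, repeatedly adding the
    highest-scoring neighboring node while its score is positive."""
    module = {seed}
    while len(module) < max_size:
        # Candidates are network neighbors of the current module.
        frontier = {n for m in module for n in adjacency.get(m, set())} - module
        if not frontier:
            break
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(frontier), key=lambda n: scores.get(n, 0.0))
        if scores.get(best, 0.0) <= 0.0:
            break
        module.add(best)
    return module

# Hypothetical toy interaction network and per-molecule scores.
adjacency = build_adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")])
scores = {"A": 2.0, "B": 1.5, "C": 1.0, "D": -0.5, "E": -1.0}
module = greedy_module(adjacency, scores, seed="A")
```

Because only network neighbors of the current module are considered at each step, the search space is restricted to connected subnetworks, which is the property the paragraph above attributes to network-based methods.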
However, to the best of our knowledge, all existing DMMMs are subject to at least one of the following two limitations. Firstly, existing DMMMs are typically supervised, in the sense that they try to find subnetworks explaining differences in the omics data between predefined case and control patients or predefined disease subtypes. This is potentially problematic, because it implies that existing DMMMs are biased by our current disease ontologies, which are mostly symptom- or organ-based and therefore often too coarse-grained. For instance, around 95 % of all patients with hypertension are diagnosed with so-called “essential hypertension” (code BA00.Z in the ICD-11 disease ontology), meaning that the cause of the hypertension is unknown. In fact, there are probably several disjoint molecular mechanisms causing “essential hypertension”, and the same holds true for many other complex diseases such as Alzheimer’s disease, multiple sclerosis, and Crohn’s disease. Supervised DMMMs that take existing disease definitions for granted hence risk overlooking the molecular mechanisms underlying mechanistically distinct subtypes.
Secondly, most existing DMMMs are designed for static omics data and do not support longitudinal data, where the patients’ molecular profiles are observed over time. Existing analysis frameworks for longitudinal omics data rely largely on purely statistical methods. Consequently, network medicine approaches for time series data are needed.
To the best of our knowledge, there are only three DMMMs which, in part, overcome these limitations: BiCoN and GrandForest allow unsupervised disease module mining but do not support longitudinal omics data. TiCoNE supports longitudinal data but requires predefined case vs. control or subtype annotations as input. There is hence an unmet need for unsupervised DMMMs for longitudinal omics data. Developing such methods is the main objective of the proposed project.