From the 77 posters that were presented on-site, 3 were selected as the winners after a shortlist round by popular vote, followed by a selection round by the conference organisers.
Resolving noise-control conflict by gene duplication
Authors: Michal Chapal, Sefi Mintzer, Sagie Brodsky, Miri Carmi, Naama Barkai, Weizmann Institute of Science, Israel
Gene duplication promotes adaptive evolution in two principal ways: allowing one duplicate to evolve a new function and resolving adaptive conflicts by splitting ancestral functions between the duplicates. In an apparent departure from both scenarios, low-expressing transcription factor duplicates commonly regulate similar sets of genes and act in overlapping conditions. To examine possible benefits of such apparently redundant duplicates, we examined the budding yeast duplicated stress regulators Msn2 and Msn4. We show that Msn2,4 indeed function as one unit, inducing the same set of target genes in overlapping conditions, yet this two-factor composition allows its expression to be both environmentally responsive and low-noise, thereby resolving an adaptive conflict that inherently limits expression of single genes. Our study exemplifies a new model for evolution by gene duplication whereby duplicates provide adaptive benefit through cooperation, rather than functional divergence: attaining two-factor dynamics with beneficial properties that cannot be achieved by a single gene.
Deep learning on single-cell ATAC-seq data to decipher enhancer logic
Authors: Ibrahim Ihsan Taskiran, Liesbeth Minnoye, Carmen Bravo Gonzalez-Blas, Sara Aibar Santos, Gert Hulselmans, Valerie Christiaens, Stein Aerts, KU Leuven – VIB, Belgium
Single-cell ATAC-seq provides new opportunities to study gene regulation in heterogeneous cell populations such as complex tissues or dynamic processes. We recently developed a probabilistic topic modeling approach, called cisTopic, to predict regulatory topics and sets of co-accessible enhancers from scATAC-seq data. Here, we apply deep learning approaches to analyze these sets of co-accessible enhancers, with the goal of predicting the spatiotemporal pattern of enhancer accessibility directly from the enhancer sequence. We trained different types of Artificial Neural Networks, including a hybrid model that combines Convolutional and Recurrent Neural Networks. By applying this approach to a cohort of melanoma patient samples and Drosophila eye disc, we show that key transcription factors can be identified from the convolutional filters. In addition, we use the trained model to analyze the motif architecture in enhancers, such as motif combinations and relationship to nucleosome preferences. We furthermore exploit network-explaining methods to predict the impact of somatic mutations, using publicly available SNP databases and in-house whole genome sequencing of inbred fly lines. Currently, to validate our models, we are testing (mutated) synthetic cell-state-specific enhancers using massively parallel enhancer reporter assays (MPRA). In conclusion, training deep learning models on single-cell epigenomics data sets has multiple applications to understand the underlying enhancer logic and decipher gene expression programs.
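To make the idea of reading transcription factor motifs off convolutional filters a little more concrete, here is a minimal sketch of what a single first-layer filter does to a one-hot encoded DNA sequence. It is not the authors' model: the filter `TOY_FILTER`, the example sequence and the function names are all made up for illustration, and a trained network would learn real-valued weights rather than the 0/1 pattern used here.

```python
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d_scan(encoded, filt):
    """Slide one filter (width x 4 weights) along the sequence; return one score per start position."""
    width = len(filt)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += encoded[start + offset][channel] * filt[offset][channel]
        scores.append(score)
    return scores

# Hypothetical filter that rewards the 4-mer "TGAC" (columns are A, C, G, T).
TOY_FILTER = [
    [0, 0, 0, 1],  # T
    [0, 0, 1, 0],  # G
    [1, 0, 0, 0],  # A
    [0, 1, 0, 0],  # C
]

scores = conv1d_scan(one_hot("AATGACGT"), TOY_FILTER)
best = max(range(len(scores)), key=scores.__getitem__)  # start position of the strongest match
```

Because the filter is just a small weight matrix over positions and bases, a trained filter can be compared with known motif matrices, which is roughly how a filter can point to a candidate transcription factor.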
ROADdt: Regulation network remodeling along disease development trajectories
Authors: Celine Sin, Jörg Menche, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
The human body is comprised of over 200 different cell types varying in size, shape, and function. The differentiation and subsequent maintenance of these different phenotypic states are governed by complex gene regulatory networks that dynamically orchestrate the activation and deactivation of genes. Abnormalities in these networks may lead to dysfunctional expression programs, e.g. uncontrolled cell proliferation. In order to understand the conditions resulting in disease, we must understand the underlying gene regulatory networks governing the gene expression program. As cells move through the differentiation space, the networks that govern gene regulation are remodeled in order to achieve the appropriate gene expression program. While statistical physics and network theory have demonstrated numerous relationships between the structure of networks and the dynamic processes that act on them, few studies link these mathematically rigorous principles to gene regulatory networks, none at the level of cell-trajectory-states. The overall goal of this project is to understand the fundamental architecture of gene regulatory networks associated with cell differentiation processes in disease. We hypothesize that the gene regulatory networks of different cell-trajectory-states along the differentiation trajectory – e.g. transitory, branching, or terminal states – are each characterized by distinct structural features. I will present our first steps in this direction, starting from single-cell RNA-seq profiles of tumors. Ultimately, we expect that detailed characterization of the gene regulatory networks in these disease processes will reveal basic principles applicable to other diseases and cell developmental processes.
110 researchers came together at the EMBL Advanced Training Centre in Heidelberg, Germany for 3.5 days of talks, posters and networking. Here we present the work of 4 scientists who received best poster awards at the conference by popular vote.
Engineering portability of the CcaSR light switch for the control of biofilm formation in Pseudomonas putida
Authors: Angeles Hueso-Gil (1), Ákos Nyerges (2), Csaba Pál (2), Belén Calles (1), Victor de Lorenzo (1)
Two of the technical challenges faced by contemporary microbiology involve controlling gene expression using light and regulating bacterial biofilm formation, which is determined by the intracellular levels of the secondary messenger c-di-GMP. The CcaSR system is one of the light switches repeatedly used for transcription induction in Escherichia coli. This two-component system represented a good candidate for adaptation to Pseudomonas putida. Previous attempts have tried to use this microorganism as a chassis for the implementation of new pathways, with biofilm formation being an important function to control. To this end, we unified the CcaSR components in one single construct and randomly mutagenized their regulatory regions to find a clone with a balanced expression of the system's key parts inside P. putida. The combination of this novel mutagenization process with a proper screening, which included a first sorting of the libraries and the later isolation of colonies, led us to a clone with a much improved induction by green light. The selected variant had a notable capacity in response to green light. Finally, the optimized CcaSR was used to control the expression of a super-efficient variant of PleD, a diguanylate cyclase of Caulobacter, which allowed a tight control of c-di-GMP levels and, therefore, of biofilm production.
Genetic code expansion is a powerful tool to study and control protein function with single-residue precision. It is widely used, e.g., to perform labeling for microscopy or to photocontrol proteins. This is achieved by introducing an orthogonal tRNA/synthetase suppressor pair into the host, to recode a stop codon to incorporate a noncanonical amino acid (ncAA) into the nascent chain. This technique is codon-specific, but it cannot select specific mRNAs, so naturally occurring stop codons could be suppressed, leading to potential interference with housekeeping translation. Nature avoids cross-talk between cellular processes by confining specific functions into organelles. We aimed to design an organelle dedicated to protein engineering, but as translation is a complex process requiring hundreds of factors to work together, membrane-encapsulation would not be feasible. Inspired by the concept of phase separation, we hypothesized that such an organelle could instead be designed membraneless. Phase separation can generate high local concentrations of proteins and RNAs in cells and has recently gained attention owing to its role in the formation of specialized organelles such as nucleoli or stress granules. Despite being membraneless and constantly exchanging with the cytoplasm/nucleoplasm, these organelles still perform complex tasks, such as transcription. We combined phase separating proteins with microtubule motor proteins to generate orthogonally translating organelles in living cells that contain an RNA-targeting system, the stop codon suppression machinery and ribosomes. These large organelles enable site- and mRNA-specific ncAA incorporation, decoding one specific codon exclusively in the mRNA of choice. Our results demonstrate a simple yet effective approach to the generation of semi-synthetic eukaryotic cells containing artificial organelles to harbor two distinct genetic codes, providing a route towards customized orthogonal translation and protein engineering.
(1) Johannes Gutenberg University Mainz, Germany (2) Institute of Molecular Biology, Germany (3) EMBL Heidelberg, Germany
Metabolic perceptrons for neural computing in biological systems
Authors: Amir Pandi (1), Mathilde Koch (1), Peter Voyvodic (2), Paul Soudier (1), Jerome Bonnet (2), Manish Kushwaha (1), Jean-Loup Faulon (1)
Synthetic biological circuits are promising tools for developing sophisticated systems for medical, industrial, and environmental applications. So far, circuit implementations commonly rely on gene expression regulation for information processing using digital logic. Here, we present a new approach for biological computation through metabolic circuits designed by computer-aided tools, implemented in both whole-cell and cell-free systems. We first combine metabolic transducers to build an analog adder, a device that sums up the concentrations of multiple input metabolites. Next, we build a weighted adder where the contributions of the different metabolites to the sum can be adjusted. Using a computational model fitted on experimental data, we finally implement two four-input perceptrons for binary classification of metabolite combinations by applying model-predicted weights to the metabolic perceptron. The perceptron-mediated neural computing introduced here lays the groundwork for more advanced metabolic circuits for rapid and scalable multiplex sensing.
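For readers less familiar with the terminology, the progression from adder to weighted adder to perceptron can be sketched in a few lines. This is only the textbook arithmetic, not the authors' fitted model: the weights, the threshold and the on/off input encoding below are hypothetical values chosen for illustration.

```python
from typing import Sequence

def weighted_adder(concentrations: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the input metabolite concentrations."""
    if len(concentrations) != len(weights):
        raise ValueError("need exactly one weight per input metabolite")
    return sum(w * c for w, c in zip(weights, concentrations))

def perceptron(concentrations: Sequence[float], weights: Sequence[float], threshold: float) -> int:
    """Binary classifier: 1 (ON) if the weighted sum exceeds the threshold, otherwise 0 (OFF)."""
    return 1 if weighted_adder(concentrations, weights) > threshold else 0

# Hypothetical weights and threshold for a four-input perceptron (illustration only).
WEIGHTS = [1.0, 1.0, 0.5, 0.5]
THRESHOLD = 1.2

if __name__ == "__main__":
    for inputs in ([1, 0, 0, 0], [1, 0, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]):
        print(inputs, "->", perceptron(inputs, WEIGHTS, THRESHOLD))
```

With all weights set to 1 the function reduces to the plain analog adder, and changing the weights changes which combinations of inputs switch the output on.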
The bottom-up recreation of cellular processes into synthetic compartments has, in recent years, emerged as an exciting line of research with which to study biological processes in a controlled environment. However, the interior of a living cell is a difficult milieu to mimic in bottom-up synthetic cells, as it is an environment crowded with high concentrations of many different biomacromolecules. In this work, we describe the development of a powerful new tool to more accurately emulate the cell cytosol in discrete coacervate-based protocells. The coacervate core utilized herein not only provides an inherently crowded and highly charged microenvironment, but has also been chemically modified to interact specifically with recombinantly expressed proteins. Our method leverages the well-established binding of His-tagged proteins to Ni2+-nitrilotriacetic acid, which ensures that macromolecules are taken up in a highly efficient, yet gentle manner, thus preserving biological activity. The straightforward method allowed for both control over the amount taken up and an increased local concentration. Moreover, the engineered uptake of proteins was then employed to study two key aspects: the effect of the Ni-NTA interaction on the diffusivity of incorporated proteins, and the enhancement in activity of an encapsulated two-enzyme cascade. This direct and targeted method of protein uptake into a discrete, membrane-bound platform is a significant step forward for synthetic cells, and will enable the engineering of highly complex enzyme and signaling networks with increasingly life-like properties.
Poster currently not available
Eindhoven University of Technology, The Netherlands
258 researchers from various fields gathered in Heidelberg last week to listen to 36 talks and engage with 146 poster presenters. Here we present the posters of 5 scientists who received best poster awards at the conference by popular vote.
Benchmarking of multi-omics joint dimensionality reduction (DR) approaches for cancer study
Authors: Laura Cantini (1), Pooya Zakeri (2), Aurelien Naldi (1), Denis Thieffry (1), Elisabeth Remy (3), Anaïs Baudot (2)
Dimensionality Reduction (DR), decomposing data into low-dimensional spaces while preserving most of their information content, is among the most prevalent machine learning techniques in data mining. With the advent of high-throughput technologies, high-dimensional data have become a standard in biology, emphasizing the use of DR. This phenomenon is particularly pronounced in cancer biology, where consortia have profiled thousands of patients for multiple molecular assays (“multi-omics”), including at the emerging single-cell scale. DR approaches have been mainly applied to single omics data, leading to cancer subtyping, tumor sub-clone quantification and immune infiltration quantification. Recently, DR approaches designed to jointly analyze multiple omics have been proposed. Integrative DR methods are based on various mathematical assumptions, ranging from extensions of CCA to tensors and more general data fusion approaches, which makes it difficult to choose which method to apply.
In this context, we benchmark multi-omics DR approaches in depth using: i) artificial multi-omics cancer data; ii) multi-omics bulk data from 10 different cancer types downloaded from TCGA; iii) multi-omics single-cell data from cancer cell lines. In (i), the capability of the various methods to predict the clustering ground truth was found to be strongly sensitive to the size of the clusters, with intNMF, RGCCA, MCIA and JIVE being the most robust methods. For (ii), MCIA, RGCCA, MOFA and JIVE more consistently identified factors associated with survival, clinical annotations and biological annotations. Finally, in (iii), despite never having been applied to single-cell data, tICA and MSFA outperformed other methods in their ability to cluster single cells based on their cell line of origin. Overall, our results show that RGCCA, MCIA and JIVE perform consistently better across the three scenarios. This suggests that a mathematical formulation based on the search for omic-specific factors whose inter-dependence is maximized better approximates the nature of multi-omics data.
(1) Institut de Biologie de l’Ecole Normale Superieure IBENS, France, (2) Aix Marseille University, INSERM, MMG, CNRS, France, (3) Aix Marseille University, CNRS, France
Single-cell transcriptome and chromatin accessibility data integration reveals cell specific signatures
Authors: Andres Quintero (1), Anne-Claire Kröger (2), Carl Herrmann (2)
The ability to integrate multiple layers of omics data will play an essential role in understanding the complex interplay of different molecular mechanisms that give rise to cellular diversity. In particular, single-cell multi-omics studies provide an enormously valuable source of information, allowing the characterization of different cell states under different biological contexts. However, the integration of distinct cellular modalities to disentangle the regulatory networks and pathways that explain cell identity is still a challenge. Here we introduce Integrative Iterative Non-negative Matrix Factorization (i2NMF), a computational method to dissect cell-type-associated signatures from multi-omics data sets. i2NMF takes full advantage of data sets with multiple modalities for the same sample or cell, defining cell-type-specific features and discerning the shared and specific contribution of each omics type to the identification of different cell types. We applied i2NMF to an early human embryo single-cell multi-omics data set for which scRNA-seq and scATAC-seq profiles were available for every single cell, identifying master transcription factors at the morula and blastocyst stages. Finally, i2NMF is also able to integrate different modalities across multiple experiments. We used this functionality to extract cell-type-specific molecular signatures from two complementary datasets of the mouse visual cortex, comprising scATAC-seq and scRNA-seq data. i2NMF was implemented on TensorFlow, presenting a scalable framework and allowing its efficient execution under multiple systems. Our results demonstrate that i2NMF is a useful tool to identify cell-type-specific signatures and dissect their underlying molecular features.
(1) German Cancer Research Center (DKFZ), Germany, (2) University Hospital Heidelberg, Germany
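As a rough illustration of how a non-negative factorization can share one set of cell loadings across two modalities, the sketch below stacks a toy RNA matrix and a toy ATAC matrix (features by cells) and factorizes them together with the classic multiplicative updates. This is plain NMF on a stacked matrix, not the i2NMF algorithm itself, whose details are not given in the abstract; the data, the rank and all function names are assumptions made for this example, and pure Python is used instead of TensorFlow to keep it self-contained.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(x, w, h):
    """Frobenius norm of the reconstruction residual x - w @ h."""
    wh = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, wh) for xv, av in zip(xr, ar)) ** 0.5

def nmf(x, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorize a non-negative matrix x (features x cells) as w @ h using multiplicative updates."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(len(x))]
    h = [[rng.random() + 0.1 for _ in range(len(x[0]))] for _ in range(rank)]
    for _ in range(n_iter):
        # h <- h * (w^T x) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[hv * n / (d + eps) for hv, n, d in zip(hr, nr, dr)] for hr, nr, dr in zip(h, num, den)]
        # w <- w * (x h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[wv * n / (d + eps) for wv, n, d in zip(wr, nr, dr)] for wr, nr, dr in zip(w, num, den)]
    return w, h

# Toy data: 4 cells measured in both modalities (values are made up).
rna = [[5, 5, 0, 0], [4, 4, 0, 0], [0, 0, 3, 3]]   # 3 genes x 4 cells
atac = [[1, 1, 0, 0], [0, 0, 2, 2]]                # 2 peaks x 4 cells

stacked = rna + atac                               # same cells, features concatenated
w, h = nmf(stacked, rank=2)
w_rna, w_atac = w[:len(rna)], w[len(rna):]         # modality-specific feature loadings
```

Stacking the features forces both modalities to be explained by the same matrix `h` of per-cell loadings, while `w_rna` and `w_atac` hold the gene and peak signatures for each factor. Methods that also separate shared from modality-specific contributions need more machinery than this.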
Linking signalling and metabolomic footprints with causal networks
Authors: Aurélien Dugourd (1), Christoph Kuppe (1), Rafael Kramann (1), Julio Saez-Rodriguez (2)
Renal clear cell carcinomas (RCCC) are the result of a system-wide dysregulation of signaling and metabolic functions originating from multiple factors. Characterizing cellular molecular machineries across multiple omic layers is a very powerful strategy to understand the cellular effects of such dysregulations. In this study, we performed metabolomics and phosphoproteomics from RCCC tissue in comparison to the non-cancerous kidney tissue in a cohort of 20 patients. In order to extract mechanistic information from these observations and to integrate both datasets, we developed a novel analysis pipeline. Phosphoproteomic abundance changes are used to estimate kinase activity changes across patients. Kinase activity estimations are then correlated with metabolite abundance changes. This points at possible interactions between signaling pathways and metabolism. We subsequently build a generic network integrating signaling pathways and metabolic reaction networks based on literature knowledge and databases. We use this signaling/metabolic network to identify paths across kinases and metabolic enzymes to link the correlated kinase activities and metabolites.
This provides potential mechanisms to explain the effect of deregulation of signaling on metabolism. Our approach was able to recover the structure of canonical signaling pathway topologies and highlight specific connections between kinases and metabolite abundance deregulated in kidney tumor tissues. This pipeline allows the extraction and comparison of mechanistic information from metabolomic, phosphoproteomic (and potentially transcriptomic) data across many kidney cancer patients. This information can be used to select potential therapeutic targets to disrupt cancer specific cellular mechanisms, such as the SP1 kinase. Furthermore, the pipeline offers the advantage of being easily transferable to many different biological contexts.
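The correlation step of such a pipeline (kinase activity estimates against metabolite abundance changes across patients) can be sketched as follows. The abstract does not say which correlation measure or cut-off was used, so Pearson correlation and the `min_abs_r` threshold are assumptions, and the kinase names, metabolite names and values are invented placeholders.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of per-patient values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 patients")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)

def correlate_layers(kinase_activity, metabolite_change, min_abs_r=0.7):
    """Return (kinase, metabolite, r) for pairs with |r| >= min_abs_r, strongest first."""
    pairs = []
    for kinase, activity in kinase_activity.items():
        for metabolite, change in metabolite_change.items():
            r = pearson(activity, change)
            if abs(r) >= min_abs_r:
                pairs.append((kinase, metabolite, r))
    return sorted(pairs, key=lambda p: -abs(p[2]))

# Placeholder data for 5 patients (tumor vs. healthy changes; values are made up).
kinases = {"kinase_A": [1, 2, 3, 4, 5], "kinase_B": [3, 1, 4, 1, 5]}
metabolites = {"metabolite_X": [2, 4, 6, 8, 10], "metabolite_Y": [5, 3, 4, 1, 2]}

hits = correlate_layers(kinases, metabolites)
```

Each pair in `hits` is only a statistical association. In the approach described above, such pairs are the starting points that are then connected through paths in the prior-knowledge signaling and metabolic network to suggest mechanisms.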
A network-based approach for the identification of multi-omics modules associated with complex human diseases
Authors: Maria Anna Wörheide (1), Jan Krumsiek (2), Gabi Kastenmüller (1), Matthias Arnold (1)
Application of advanced high-throughput omics technologies has provided us with vast amounts of quantitative, highly valuable data. For complex, heterogeneous, and untreatable diseases such as Alzheimer’s disease (AD), the integration of different omics levels and their interconnections is desperately needed to understand the underlying molecular pathomechanisms and identify potential therapeutic targets. However, integrated, multivariable analyses of cross-omics data are not straightforward, and even if successfully applied, often lack a human comprehensible representation. Graph databases provide an intuitive and mathematically well-defined framework to store and interconnect diverse biological domains in accessible network structures. Here, we propose a network-based, multi-omics framework developed with the graph database Neo4j that allows the large-scale integration and analysis of data on biological entities across omics, as well as results from association analysis with specific (endo)phenotypes. The backbone of this framework comes from known biological relationships and functional/pathway annotations available in public databases. It is augmented with experimental, quantitative data for single omics (e.g. tissue-specific gene expression) and across omics (e.g. eQTLs or mQTLs) derived in population-based studies. To identify modules within this network that are potentially relevant to a disease such as AD, we extend the framework using large-scale association data for AD (e.g. from case-control GWASs). The resulting network is comprised of over 50 million nodes (entities), representing more than 30 different data types, and more than 80 million edges (relationships). We mined this comprehensive catalogue of biological information using established graph algorithms to identify potentially disease-related modules of tightly interlinked entities, and were able to obtain several subnetworks significantly enriched for AD-associations.
Recent high-throughput transcription factor (TF) binding assays revealed that TF cooperativity is a widespread phenomenon. However, we still lack a global mechanistic and functional understanding of TF cooperativity. To close this gap, we introduce a statistical learning framework that provides structural insight into TF cooperativity and its functional consequences based on next generation sequencing data. We identify DNA shape as a driver of cooperativity, with a particularly strong effect for Forkhead-Ets pairs. Follow-up experiments revealed a local shape preference at the Ets-DNA-Forkhead interface and a decreased cooperativity once the interaction is lost. Additionally, we discovered many novel functional associations for cooperatively bound TFs. Examining the novel link between FOXO1:ETV6 and lymphomas revealed that their joint expression levels improve patient survival stratification.
Altogether, our results demonstrate that inter-family cooperative TF binding is driven by position-specific DNA readout mechanisms, which provides an additional regulatory layer for downstream biological functions.
For those of you who have been coming to EMBL for scientific training over the years, you may have noticed that we recently (finally?!) have a new and improved registration and abstract submission software, with a brand new look and feel.
We have moved to an HTML5 software solution, which offers an enhanced customer experience, meaning that we now no longer have browser restrictions or preferred browsers. The interface is fully responsive for submitters and evaluators alike, and is user-friendly on all devices. YAAAAAAY!!!!
The new software is pretty self-explanatory, but just in case you get stuck, here are a couple of how-to videos for abstract and motivation letter submission.
How to submit an abstract – for EMBL conferences and symposia
How to submit a motivation letter – for EMBL courses