Concept: Gene regulatory network
Reconstructing gene regulatory networks (GRNs) from expression data is one of the most important challenges in systems biology research. Many computational models and methods have been proposed to automate the process of network reconstruction. Inferring robust networks with desired behaviours remains challenging, however. This problem is related to network dynamics but has yet to be investigated using network modeling.
BACKGROUND: Dynamic Bayesian network (DBN) is among the mainstream approaches for modeling various biological networks, including the gene regulatory network (GRN). Most current methods for learning DBN employ either local search such as hill-climbing, or a meta stochastic global optimization framework such as genetic algorithm or simulated annealing, which are only able to locate sub-optimal solutions. Further, current DBN applications have essentially been limited to small sized networks. RESULTS: To overcome the above difficulties, we introduce here a deterministic global optimization based DBN approach for reverse engineering genetic networks from time course gene expression data. For such DBN models that consist only of inter time slice arcs, we show that there exists a polynomial time algorithm for learning the globally optimal network structure. The proposed approach, named GlobalMIT+, employs the recently proposed information theoretic scoring metric named mutual information test (MIT). GlobalMIT+ is able to learn high-order time delayed genetic interactions, which are common to most biological systems. Evaluation of the approach using both synthetic and real data sets, including a 733 cyanobacterial gene expression data set, shows significantly improved performance over other techniques. CONCLUSIONS: Our studies demonstrate that deterministic global optimization approaches can infer large scale genetic networks.
Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to discover whether there are rules or regularities governing development and evolution of complex multi-cellular organisms.
Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision making processes. Advancement in high throughput experimental techniques has initiated innovative data driven analysis of gene regulatory networks. However, inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, evidence of robust algorithms directly exploiting basic biological traits are few. Such algorithms are expected to be efficient in their performance and robust in their prediction.
Hierarchical generative models, such as Bayesian networks, and belief propagation have been shown to provide a theoretical framework that can account for perceptual processes, including feedforward recognition and feedback modulation. The framework explains both psychophysical and physiological experimental data and maps well onto the hierarchical distributed cortical anatomy. However, the complexity required to model cortical processes makes inference, even using approximate methods, very computationally expensive. Thus, existing object perception models based on this approach are typically limited to tree-structured networks with no loops, use small toy examples or fail to account for certain perceptual aspects such as invariance to transformations or feedback reconstruction. In this study we develop a Bayesian network with an architecture similar to that of HMAX, a biologically-inspired hierarchical model of object recognition, and use loopy belief propagation to approximate the model operations (selectivity and invariance). Crucially, the resulting Bayesian network extends the functionality of HMAX by including top-down recursive feedback. Thus, the proposed model not only achieves successful feedforward recognition invariant to noise, occlusions, and changes in position and size, but is also able to reproduce modulatory effects such as illusory contour completion and attention. Our novel and rigorous methodology covers key aspects such as learning using a layerwise greedy algorithm, combining feedback information from multiple parents and reducing the number of operations required. Overall, this work extends an established model of object recognition to include high-level feedback modulation, based on state-of-the-art probabilistic approaches. The methodology employed, consistent with evidence from the visual cortex, can be potentially generalized to build models of hierarchical perceptual organization that include top-down and bottom-up interactions, for example, in other sensory modalities.
An important problem in systems biology is to reconstruct gene regulatory networks (GRNs) from experimental data and other a priori information. The DREAM project offers some types of experimental data, such as knockout data, knockdown data, time series data, etc. Among them, multifactorial perturbation data are easier and less expensive to obtain than other types of experimental data and are thus more common in practice. In this article, a new algorithm is presented for the inference of GRNs using the DREAM4 multifactorial perturbation data. The GRN inference problem among [Formula: see text] genes is decomposed into [Formula: see text] different regression problems. In each of the regression problems, the expression level of a target gene is predicted solely from the expression level of a potential regulation gene. For different potential regulation genes, different weights for a specific target gene are constructed by using the sum of squared residuals and the Pearson correlation coefficient. Then these weights are normalized to reflect effort differences of regulating distinct genes. By appropriately choosing the parameters of the power law, we constructe a 0-1 integer programming problem. By solving this problem, direct regulation genes for an arbitrary gene can be estimated. And, the normalized weight of a gene is modified, on the basis of the estimation results about the existence of direct regulations to it. These normalized and modified weights are used in queuing the possibility of the existence of a corresponding direct regulation. Computation results with the DREAM4 In Silico Size 100 Multifactorial subchallenge show that estimation performances of the suggested algorithm can even outperform the best team. Using the real data provided by the DREAM5 Network Inference Challenge, estimation performances can be ranked third. Furthermore, the high precision of the obtained most reliable predictions shows the suggested algorithm may be helpful in guiding biological experiment designs.
Cybergenetics is a novel field of research aiming at remotely pilot cellular processes in real-time with to leverage the biotechnological potential of synthetic biology. Yet, the control of only a small number of genetic circuits has been tested so far. Here we investigate the control of multistable gene regulatory networks, which are ubiquitously found in nature and play critical roles in cell differentiation and decision-making. Using an in silico feedback control loop, we demonstrate that a bistable genetic toggle switch can be dynamically maintained near its unstable equilibrium position for extended periods of time. Importantly, we show that a direct method based on dual periodic forcing is sufficient to simultaneously maintain many cells in this undecided state. These findings pave the way for the control of more complex cell decision-making systems at both the single cell and the population levels, with vast fundamental and biotechnological applications.
A gene regulatory network (GRN) is a collection of regulatory interactions between transcription factors (TFs) and their target genes. GRNs control different biological processes and have been instrumental to understand the organization and complexity of gene regulation. Although various experimental methods have been used to map GRNs in Arabidopsis thaliana, their limited throughput combined with the large number of TFs makes that for many genes our knowledge about regulating TFs is incomplete. We introduce TF2Network, a tool that exploits the vast amount of TF binding site information and enables the delineation of GRNs by detecting potential regulators for a set of co-expressed or functionally related genes. Validation using two experimental benchmarks reveals that TF2Network predicts the correct regulator in 75-92% of the test sets. Furthermore, our tool is robust to noise in the input gene sets, has a low false discovery rate, and shows a better performance to recover correct regulators compared to other plant tools. TF2Network is accessible through a web interface where GRNs are interactively visualized and annotated with various types of experimental functional information. TF2Network was used to perform systematic functional and regulatory gene annotations, identifying new TFs involved in circadian rhythm and stress response.
Synthetic biology is a promising tool to study the function and properties of gene regulatory networks. Gene circuits with predefined behaviours have been successfully built and modelled, but largely on a case-by-case basis. Here we go beyond individual networks and explore both computationally and synthetically the design space of possible dynamical mechanisms for 3-node stripe-forming networks. First, we computationally test every possible 3-node network for stripe formation in a morphogen gradient. We discover four different dynamical mechanisms to form a stripe and identify the minimal network of each group. Next, with the help of newly established engineering criteria we build these four networks synthetically and show that they indeed operate with four fundamentally distinct mechanisms. Finally, this close match between theory and experiment allows us to infer and subsequently build a 2-node network that represents the archetype of the explored design space.
Deduction of biological regulatory networks from their functions is one of the focus areas of systems biology. Among the different techniques used in this reverse-engineering task, one powerful method is to enumerate all candidate network structures to find suitable ones. However, this method is severely limited by calculation capability: due to the brute-force approach, it is infeasible for networks with large number of nodes to be studied using traditional enumeration method because of the combinatorial explosion. In this study, we propose a new reverse-engineering technique based on the enumerating method: sub-network combinations. First, a complex biological function is divided into several sub-functions. Next, the three-node-network enumerating method is applied to search for sub-networks that are able to realize each of the sub-functions. Finally, complex whole networks are constructed by enumerating all possible combinations of sub-networks. The optimal ones are selected and analyzed. To demonstrate the effectiveness of this new method, we used it to deduct the network structures of a Pavlovian-like function. The whole Pavlovian-like network was successfully constructed by combining robust sub-networks, and the results were analyzed. With sub-network combination, the complexity has been largely reduced. Our method also provides a functional modular view of biological systems.