Systems biology of plant molecular networks: from networks to models

F.L. Valentim

Research output: Thesis, internal PhD, WU


Developmental processes are controlled by gene regulatory networks (GRNs), which are tightly

coordinated networks of transcription factors (TFs) that activate and repress gene expression

within a spatial and temporal context. In Arabidopsis thaliana, the key components and network

structures of the GRNs controlling major plant reproduction processes, such as floral transition

and floral organ identity specification, have been comprehensively unveiled, thanks to

advances in ‘omics’ technologies combined with genetic approaches. Yet, because of the

multidimensional nature of the data and because of the complexity of the regulatory

mechanisms, there is a clear need to analyse these data in such a way that we can understand

how TFs control complex traits. The use of mathematical modelling facilitates the

representation of the dynamics of a GRN and enables better insight into GRN complexity; while

multidimensional data analysis enables the identification of properties that connect different

layers from genotype-to-phenotype. Mathematical modelling and multidimensional data

analysis are both parts of a systems biology approach, and this thesis presents the application of

both types of systems biology approaches to flowering GRNs.

Chapter 1 comprehensively reviews advances in understanding of GRNs underlying plant

reproduction processes, as well as mathematical models and multidimensional data analysis

approaches to study plant systems biology. As discussed in Chapter 1, an important aspect of

understanding these GRNs is how perturbations in one part of the network are transmitted to

other parts, and ultimately how this results in changes in phenotype. Given the complexity of

recent versions of Arabidopsis GRNs - which involve highly connected, non-linear networks

of TFs, microRNAs, movable factors, hormones and chromatin modifying proteins - it is not

possible to predict the effect of gene perturbations on e.g. flowering time in an intuitive way by

just looking at the network structure. Therefore, mathematical modelling plays an important role

in providing a quantitative understanding of GRNs. In addition, aspects of multidimensional

data analysis for understanding GRNs underlying plant reproduction are also discussed in the

first Chapter. This includes not only the integration of experimental data, e.g. transcriptomics

with protein-DNA binding profiling, but also the integration of different types of networks

identified by ‘omics’ approaches, e.g. protein-protein interaction networks and gene regulatory networks.


Chapter 2 describes a mathematical model for representing the dynamics of key genes in the

GRN of flowering time control. We modelled with ordinary differential equations (ODEs) the

physical interactions and regulatory relationships of a set of core genes controlling Arabidopsis

flowering time in order to quantitatively analyse the relationship between their expression levels

and the flowering time response. We considered a core GRN composed of eight TFs: SHORT

VEGETATIVE PHASE (SVP), [...], FLOWERING LOCUS T (FT), LEAFY (LFY) and FD. The connections and interactions

amongst these components are justified based on experimental data, and the model is

parameterised by fitting the equations to quantitative data on gene expression and flowering

time. Then the model is validated with transcript data from a range of mutants. We verify that

the model is able to describe some quantitative patterns seen in expression data under genetic

perturbations, which supports the credibility of the model and its dynamic properties. The

proposed model is able to predict the flowering time by assessing changes in the expression of

AP1, the orchestrator of the floral transition. Overall, the work presents a framework for

addressing how different quantitative inputs are combined into a single quantitative output, i.e.

the timing of flowering. The model allowed studying the established genetic regulations, and we

discuss in Chapter 5 the steps towards using the proposed framework to zoom in and obtain new

insights into the molecular mechanisms underlying the regulations.
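
The idea of reading a flowering time off the dynamics of a target gene can be illustrated with a minimal sketch. It assumes a single regulator at a constant level activating one target through a Hill function, with linear decay of the target, integrated with explicit Euler steps. The function names, parameter values and threshold are all hypothetical, and this is not the eight-gene model of Chapter 2.

```python
def hill_activation(x, k, n):
    """Fraction of maximal activation at regulator level x (Hill function)."""
    return x**n / (k**n + x**n)


def simulate_threshold_time(input_level, beta=1.0, k=0.5, n=2, decay=0.1,
                            threshold=5.0, dt=0.01, t_max=200.0):
    """Integrate dy/dt = beta * hill(x) - decay * y with explicit Euler steps.

    Returns the first time at which the target level y reaches `threshold`,
    or None if it never does before t_max. All values are illustrative
    assumptions, not fitted parameters.
    """
    y, t = 0.0, 0.0
    production = beta * hill_activation(input_level, k, n)
    while t < t_max:
        if y >= threshold:
            return t
        y += dt * (production - decay * y)
        t += dt
    return None


# Toy "genotypes" that differ only in the level of the upstream regulator.
wild_type = simulate_threshold_time(input_level=1.0)
weak_input = simulate_threshold_time(input_level=0.6)
no_input = simulate_threshold_time(input_level=0.0)
print(wild_type, weak_input, no_input)
```

In this toy setting a weaker input delays the threshold crossing, and removing the input prevents it altogether, which is the kind of quantitative input-to-output relationship the framework is meant to capture.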

Systems biology does not only involve the use of dynamic modelling but also the development

of approaches for multidimensional data analysis that are able to integrate multiple levels of

systems organization. In Chapter 3, we aimed at comprehensively identifying and characterizing

cis-regulatory mutations that have an effect on the GRN of flowering time control. By using

ChIP-seq data and information about known DNA binding motifs of TFs involved in plant

reproduction, we identified single-nucleotide polymorphisms (SNPs) that are highly

discriminative in the classification of the flowering time phenotypes. Often, SNPs that overlap

the position of experimentally determined binding sites (e.g. by ChIP-seq) are considered

putative regulatory SNPs. We showed that regulatory SNPs are difficult to pinpoint among the

sea of polymorphisms localized within binding sites determined by ChIP-seq studies. To

overcome this, we narrowed the resolution by focusing on the subset of SNPs that are located

within ChIP-seq peaks but that are also part of known regulatory motifs. These SNPs were used

as input in a classification algorithm that could predict flowering time of Arabidopsis accessions

relative to Col-0. Our strategy is able to identify SNPs that have a biological link with changes

in flowering time. We then surveyed the literature to formulate hypotheses that explain the

regulatory mechanism underlying the difference in phenotype conferred by a SNP. Examples

include SNPs that disrupt the flowering time gene FT, in which the mutation presumably disrupts

the binding region of SVP. In Chapter 5 we discuss the steps towards extending our approach to

obtain a more comprehensive survey of variants that have an effect on flowering time control.
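
The narrowing step described above, keeping only SNPs that fall inside a ChIP-seq peak and also inside a known motif occurrence, can be sketched as a simple interval filter. The coordinates, function names and data layout below are hypothetical and serve only as an illustration. The classification step is left out because this summary does not specify the algorithm.

```python
def in_any_interval(position, intervals):
    """True if position lies inside any (start, end) interval, ends inclusive."""
    return any(start <= position <= end for start, end in intervals)


def candidate_regulatory_snps(snp_positions, chipseq_peaks, motif_hits):
    """Keep SNPs located in a ChIP-seq peak and also inside a motif occurrence."""
    return [pos for pos in snp_positions
            if in_any_interval(pos, chipseq_peaks)
            and in_any_interval(pos, motif_hits)]


# Hypothetical coordinates on a single chromosome.
peaks = [(100, 300), (900, 1100)]
motifs = [(150, 159), (500, 509), (1000, 1009)]
snps = [120, 155, 505, 950, 1003, 2000]

selected = candidate_regulatory_snps(snps, peaks, motifs)
print(selected)  # [155, 1003]
```

SNPs at 120 and 950 lie in peaks but outside any motif, and 505 lies in a motif outside any peak, so only the two positions that satisfy both criteria remain as classifier features.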

In Chapter 4, we propose a method for genome-wide prediction of protein-protein interaction

(PPI) sites from the Arabidopsis interactome. Our method, named SLIDERbio, uses features

encoded in the sequence of proteins and their interactions to predict PPI sites. More specifically,

our method mines PPI networks to find over-represented sequence motifs in pairs of interacting

proteins. In addition, the inter-species conservation of these over-represented motifs, as well as

their predicted surface accessibility, are taken into account to compute the likelihood of these

motifs being located in a PPI site. Our results suggest that motifs over-represented in pairs of

interacting proteins, conserved across orthologs and with high predicted surface

accessibility are in general good putative interaction sites. We applied our method to obtain

interactome-wide predictions for Arabidopsis proteins. The results were explored to formulate

testable hypotheses for the molecular mechanisms underlying effects of spontaneous or induced

mutagenesis on e.g. ZEITLUPE, CXIP1 and SHY2 (proteins relevant for flowering time). In

addition, we showed that the binding sites are under stronger selective pressure than the overall

protein sequence, and that this may be used to link sequence variability to functional
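
The notion of a motif pair being over-represented among interacting proteins can be illustrated with a toy enrichment ratio. It compares how often two motifs co-occur across interacting pairs with how often they co-occur across all protein pairs. The sequences, motifs and scoring below are hypothetical. This is not the SLIDERbio scoring scheme, and it leaves out the conservation and surface accessibility terms described above.

```python
from itertools import combinations


def pair_has_motifs(seq_a, seq_b, motif_1, motif_2):
    """True if one sequence contains motif_1 and the other contains motif_2."""
    return ((motif_1 in seq_a and motif_2 in seq_b)
            or (motif_1 in seq_b and motif_2 in seq_a))


def motif_pair_enrichment(sequences, interactions, motif_1, motif_2):
    """Frequency of the motif pair among interacting pairs divided by its
    frequency among all protein pairs (values above 1 suggest enrichment)."""
    all_pairs = list(combinations(sorted(sequences), 2))
    hits_all = sum(pair_has_motifs(sequences[a], sequences[b], motif_1, motif_2)
                   for a, b in all_pairs)
    hits_int = sum(pair_has_motifs(sequences[a], sequences[b], motif_1, motif_2)
                   for a, b in interactions)
    if hits_all == 0 or not interactions:
        return 0.0
    return (hits_int / len(interactions)) / (hits_all / len(all_pairs))


# Hypothetical protein sequences and interaction list.
sequences = {
    "P1": "MKLVRRAS",
    "P2": "GGDEEDLT",
    "P3": "AKLVQQ",
    "P4": "TTDEEDS",
    "P5": "MNNNQ",
}
interactions = [("P1", "P2"), ("P3", "P4"), ("P1", "P5")]

ratio = motif_pair_enrichment(sequences, interactions, "KLV", "DEED")
print(round(ratio, 3))
```

A realistic analysis would need a proper statistical test and correction for multiple motif pairs. The ratio here only shows the counting logic.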


Finally, Chapter 5 concludes this thesis and describes future perspectives in systems biology

applied to the study of GRNs underlying plant reproduction processes. Two key directions are

often followed in systems biology: 1) compiling systems-wide snapshots in which the

relationships and interactions between the molecules of a system are comprehensively

represented; and 2) generating accurate experimental data that can be used as input for the

modelling concepts and techniques or multi-dimensional data analysis. Highlighted in Chapter 5

are the limitations in key steps within the systems biology framework applied to GRN studies.

In addition, I discuss improvements and extensions that we envision for our model related to

the GRN underlying the control of flowering time. Future steps for multi-dimensional data

analysis are also discussed. To sum up, I discuss how to connect the different technologies

developed in this thesis towards understanding the interplay between the roles of the genes,

developmental stages and environmental conditions.

Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Wageningen University
  • Angenent, Gerco, Promotor
  • van Dijk, Aalt-Jan, Co-promotor
Award date: 19 Jan 2015
Place of Publication: Wageningen
Print ISBNs: 9789462572171
Publication status: Published - 2015


Keywords
  • systems biology
  • networks
  • models
  • genetic regulation
  • gene expression
  • plants
  • molecular biology
