Abstract
Probabilistic graphical models (PGMs) offer a conceptual architecture where biological and mathematical objects can be expressed with a common, intuitive formalism. This facilitates the joint development of statistical and computational tools for quantitative analysis of biological data. Over the last few decades, procedures based on well-understood principles for constructing PGMs from observational and experimental data have been studied extensively, and they thus form a model-based methodology for analysis and discovery. In this thesis, we further explore the potential of this methodology in systems biology and quantitative genetics, and illustrate the capabilities of our proposed approaches through several applications to both real and simulated omics data.
In quantitative genetics, we partition phenotypic variation into heritable (genetic) and non-heritable (environmental) parts. In molecular genetics, we identify chromosomal regions that drive genetic variation: quantitative trait loci (QTLs). In systems genetics, we would like to answer the question of whether relations between multiple phenotypic traits can be organized within wholly or partially directed network structures. Directed edges in those networks can be interpreted as causal relationships, causality meaning that the consequences of interventions are predictable: phenotypic interventions in upstream traits, i.e. traits occurring early in causal chains, will produce changes in downstream traits. The effect of a QTL allele can be considered to represent a genetic intervention on the phenotypic network. Various methods have been proposed for statistical reconstruction of causal phenotypic networks exploiting previously identified QTLs. In chapter 2, we present a novel heuristic search algorithm, the QTL+phenotype supervised orientation (QPSO) algorithm, to infer causal relationships between phenotypic traits. Our algorithm shows good performance in the common, but so far unaddressed, case where some traits come without QTLs. Therefore, our algorithm is especially attractive for applications involving expensive phenotypes, like metabolites, where relatively few genotypes can be measured and population size is limited.
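As a brief illustration of the variance partition mentioned at the start of the paragraph above, the decomposition is commonly written as follows. This is the standard textbook form under the simplifying assumption of no genotype-by-environment covariance or interaction; the symbols $\sigma_P^2$, $\sigma_G^2$, $\sigma_E^2$ and $H^2$ are notation introduced here, not taken from the thesis.

```latex
% Phenotypic variance split into genetic and environmental components
\sigma_P^2 = \sigma_G^2 + \sigma_E^2
% Broad-sense heritability: the share of phenotypic variance that is genetic
H^2 = \frac{\sigma_G^2}{\sigma_P^2}
```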
Standard QTL mapping typically models phenotypic variations observable in nature in relation to genetic variation in gene expression, regardless of multiple intermediate-level biological variations. In chapter 3, we present an approach integrating Gaussian graphical modeling (GGM) and causal inference for simultaneous modeling of multilevel biological responses to DNA variations. More specifically, for ripe tomato fruits, the dependencies of 24 sensory traits on 29 metabolites and the dependencies of all the sensory and metabolic traits further on 21 QTLs were investigated by three GGM approaches: (i) lasso-based neighborhood selection in combination with a stability approach to regularization selection, (ii) the PC-skeleton algorithm and (iii) the Lasso in combination with stability selection, each followed by the QPSO algorithm. The inferred dependency network, though not necessarily representing biological pathways, suggests how the effects of allele substitutions propagate through multilevel phenotypes. Such simultaneous study of the underlying genetic architecture and multifactorial interactions is expected to enhance the prediction and manipulation of complex traits. The approach is also applicable to a range of population structures, including offspring populations from crosses between inbred parents and outbred parents, association panels and natural populations.
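The idea behind Gaussian graphical modeling referred to above can be illustrated with a minimal sketch. It is not the lasso-based or PC-skeleton procedure used in the thesis: it simply inverts the sample covariance matrix of simulated data to obtain partial correlations. The variable names (`qtl`, `metabolite`, `sensory`), the effect sizes and the sample size are hypothetical choices made for illustration only.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation matrix obtained from the inverse sample covariance."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated linear chain (hypothetical): qtl -> metabolite -> sensory
rng = np.random.default_rng(0)
n = 5000
qtl = rng.choice([-1.0, 1.0], size=n)            # marker coded -1/+1 (an assumption)
metabolite = 0.8 * qtl + rng.normal(size=n)      # metabolite depends on the marker
sensory = 0.8 * metabolite + rng.normal(size=n)  # sensory trait depends on the metabolite only
data = np.column_stack([qtl, metabolite, sensory])

marginal = np.corrcoef(data, rowvar=False)
partial = partial_correlations(data)

# The marker and the sensory trait are marginally correlated, but their partial
# correlation given the metabolite is close to zero, so a GGM would draw no
# direct edge between them.
print("marginal corr(qtl, sensory):", round(marginal[0, 2], 3))
print("partial  corr(qtl, sensory | metabolite):", round(partial[0, 2], 3))
```

With many traits and few samples the sample covariance is not reliably invertible, which is one reason regularized estimators such as lasso-based neighborhood selection are used instead of direct inversion.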
In chapter 4, we report a novel method for linkage map construction using probabilistic graphical models. It has been shown that linkage map construction can be hampered by the presence of genotyping errors and chromosomal rearrangements such as inversions and translocations. Our proposed method is proven, both theoretically and practically, to be effective in filtering out markers that contain genotyping errors. In particular, it carries out marker filtering and ordering simultaneously, and is therefore superior to the standard post-hoc filtering using nearest-neighbour stress. Furthermore, we demonstrate empirically that the proposed method offers a promising solution to genetic map construction in the case of a reciprocal translocation.
In the domain of PGMs, Bayesian networks (BNs) have proven, both theoretically and practically, to be a promising tool for the reconstruction of causal networks. In particular, the PC algorithm and the Metropolis-Hastings (MH) algorithm, which are representative of mainstream methods for BN structure learning, are reported to have been successfully applied in the field of biology. Given that most biological systems take the form of random networks or scale-free networks, in chapter 5 we compare the performance of the two algorithms in constructing both random and scale-free BNs. Our simulation study shows that for either type of BN, the PC algorithm is superior to the MH algorithm in terms of timeliness; the MH algorithm is preferable to the PC algorithm when the completeness of reconstruction is emphasized; but when the fidelity of reconstruction is taken into account, the better of the two algorithms varies from case to case. Moreover, whichever algorithm is adopted, larger sample sizes generally permit more accurate reconstructions, especially with regard to the completeness of the resulting networks.
Finally, chapter 6 presents a further elaboration and discussion of the key concepts and results involved in this thesis.
Original language  English 

Qualification  Doctor of Philosophy 
Awarding Institution 

Supervisors/Advisors 

Award date  18 May 2017 
Place of Publication  Wageningen 
Publisher  
Print ISBNs  9789463431538 
DOIs  
Publication status  Published  2017 
Keywords
 probabilistic models
 models
 networks
 linkage
 mathematics
 statistics
 quantitative trait loci
 phenotypes
 simulation
Title  Using probabilistic graphical models to reconstruct biological networks and linkage maps
Projects
 1 Finished

Regulatory network reconstruction.
Wang, H., Jansen, H. & van Eeuwijk, F.
19/10/09 → 18/05/17
Project: PhD