Chapter 1 provides the background for the thesis. In it, I give a short overview of what biotechnology is and how it has been utilized for thousands of years. I then address how modern biotechnology has evolved over the past few decades. This progress was triggered by the discovery of nucleic acids and has been marked by a focus on understanding cell and organism function at the genetic level, and on subsequently manipulating that function to benefit society. Furthermore, computational biology has become increasingly important to the success of biotechnological research, for example in an anti-malarial drug-producing yeast. However, for microalgae, which are very promising organisms for biotechnological applications, there are essentially no successfully commercialized examples of modern biotechnology. The chapter further discusses the importance of computationally predicting protein functions and its role in bioinformatics and systems biology research, concluding that this is one of the challenges for microalgal biotechnology. This topic is discussed in all chapters of this thesis, as its overarching goal is to develop and deploy tools and methodologies that increase our understanding of microalgae as cell factories.
Chapter 2 is a review of the state of microalgal biotechnology in 2014, of which the major discussion points are still valid, and of how bioinformatics and systems biology should be used to further microalgal research. It describes the challenges of microalgal genomics, bioinformatics, and systems biology research. The chapter addresses a few challenges for microalgae in particular: a lack of genomic data, a small number of validated protein functions, and genome-scale metabolic models that are largely based on Arabidopsis thaliana. Suggestions are made on how to overcome these challenges, for example by making better use of bioinformatics methods and databases.
Chapter 3 addresses a specific challenge: the need for accurate annotation of the functions of microalgal proteins. It exposes how little we understand of microalgal protein functions, with a staggering 90% of their annotations also present in the distantly related plant Arabidopsis thaliana. Finally, this chapter outlines areas in which microalgal protein function prediction can be improved.
In Chapter 4, I present CrowdGO, a protein function prediction tool based on the “wisdom of the crowd” principle that aims to overcome the major problem highlighted in Chapter 3. It operates by taking and merging the existing predictions made by other methods. These merged predictions are then passed to a machine learning algorithm that is trained to recognize patterns in the predictions and to correlate them with true or false positives. CrowdGO is significantly more accurate than existing prediction methods (p-value < 2.22e-16) and also reaches an improved precision and recall optimum.
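The merge-then-learn idea described for Chapter 4 can be illustrated with a minimal sketch. This is not CrowdGO's actual implementation: the summary does not name the machine learning algorithm, so a small logistic regression written in plain Python stands in for it, and the method names, scores, GO identifiers, and labels below are invented toy values used only for illustration.

```python
import math
from collections import defaultdict


def merge_predictions(predictions_by_method):
    """Merge per-method scores into one feature vector per (protein, GO term) pair.

    A method that did not predict a pair contributes a score of 0.0.
    """
    methods = sorted(predictions_by_method)
    merged = defaultdict(lambda: [0.0] * len(methods))
    for col, method in enumerate(methods):
        for pair, score in predictions_by_method[method].items():
            merged[pair][col] = score
    return methods, dict(merged)


def predict_proba(weights, bias, x):
    """Probability that a merged prediction is a true positive."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights with batch gradient descent.

    Stand-in learner (assumption): the real tool may use a different model.
    """
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical input: confidence scores from two invented prediction methods.
toy_predictions = {
    "method_a": {
        ("P1", "GO:0000001"): 0.9, ("P1", "GO:0000002"): 0.2,
        ("P2", "GO:0000001"): 0.7, ("P3", "GO:0000004"): 0.6,
        ("P3", "GO:0000002"): 0.4,
    },
    "method_b": {
        ("P1", "GO:0000001"): 0.8, ("P2", "GO:0000001"): 0.9,
        ("P2", "GO:0000003"): 0.3, ("P3", "GO:0000004"): 0.7,
        ("P3", "GO:0000002"): 0.2,
    },
}
# Hypothetical ground truth: 1 = validated annotation, 0 = false positive.
toy_labels = {
    ("P1", "GO:0000001"): 1, ("P1", "GO:0000002"): 0,
    ("P2", "GO:0000001"): 1, ("P2", "GO:0000003"): 0,
    ("P3", "GO:0000004"): 1, ("P3", "GO:0000002"): 0,
}

methods, merged = merge_predictions(toy_predictions)
pairs = sorted(merged)
weights, bias = train_logistic(
    [merged[p] for p in pairs], [toy_labels[p] for p in pairs]
)

# Score two unseen merged predictions: strong agreement vs. weak support.
confident = predict_proba(weights, bias, [0.85, 0.9])
doubtful = predict_proba(weights, bias, [0.1, 0.1])
```

In this toy setting the learned model assigns a high probability to a term that both methods support strongly and a low one to a term with weak support, which is the kind of pattern the chapter summary describes the learner picking up from merged predictions.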
In Chapter 5, we deploy CrowdGO on the genome of the oleaginous yeast Cutaneotrichosporon curvatus, which thus serves as a real biological test case for the method. Comparison of the CrowdGO-annotated C. curvatus proteins with the existing annotations of a related yeast showcases the potential of CrowdGO. GO enrichment analysis of C. curvatus transcriptomes from normal growth conditions and nitrogen-starved conditions shows that cell maintenance functions are enriched in the former and stress functions in the latter. This is in line with what one would expect for an oleaginous eukaryote and reassures us that the CrowdGO annotations are reliable. The CrowdGO annotations are further used in a manual annotation pipeline, with which we manually curated over 700 metabolic C. curvatus proteins. These are used together with differential expression analysis to characterize triacylglycerol synthesis under nitrogen starvation conditions. Only one enzyme was missing after the first round of annotations, which indicates that the manual annotation pipeline has a high recall for enzymes.
In Chapter 6, we study the comparative genomics of different strains of Botryococcus braunii, an oleaginous eukaryote that makes large amounts of either polysaccharides or hydrocarbons, depending on the strain. In this chapter, all methodologies discussed or developed in the previous chapters are used to try to identify the key genetic differences between the two strains that lead to polysaccharide or hydrocarbon synthesis. We use CrowdGO to annotate all the proteins and manually annotate a thousand metabolic proteins. These annotations are used in conjunction with quantitative proteomics analysis of several conditions, including light and dark, different nitrogen levels, and different cell phases. By combining the manual annotations and the proteomics analysis, we were able to characterize several key pathways, including the non-mevalonate pathway, the fucose synthesis pathway, and the TCA cycle. Analysis of these pathways reveals key differences in the expression of enzymes that are likely to correspond to polysaccharide or hydrocarbon synthesis. Apart from revealing some key features of Botryococcus braunii, this chapter serves as a template for future large-scale microalgal research.
Chapter 7 is a general discussion of the thesis. In it, I discuss how the work in this thesis relates to the SPLASH project for microalgae. Furthermore, I discuss how microalgal annotations can still be improved through the use of various stages of bioinformatics, systems biology, and synthetic/metabolic engineering research. Finally, I discuss how microalgae have potential as protein farms, and how it might be possible to unlock this potential.
|Qualification|Doctor of Philosophy|
|Award date|13 Nov 2018|
|Place of Publication|Wageningen|
|Publication status|Published - 2018|