In silico analysis of design of experiment methods for metabolic pathway optimization

Sara Moreno-Paz, Joep Schmitz, Maria Suarez-Diez*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Microbial cell factories allow the production of chemicals, presenting an alternative to traditional fossil fuel-dependent production. However, finding the optimal expression of production pathway genes is crucial for the development of efficient production strains. Unlike sequential experimentation, combinatorial optimization captures the relationships between pathway genes and production, albeit at the cost of conducting multiple experiments. Fractional factorial designs followed by linear modeling and statistical analysis reduce the experimental workload while maximizing the information gained during experimentation. Although tools to perform and analyze these designs are available, guidelines for selecting appropriate factorial designs for pathway optimization are missing. In this study, we leverage a kinetic model of a seven-gene pathway to simulate the performance of a full factorial strain library. We compare this approach to resolution V, IV, III, and Plackett-Burman (PB) designs. Additionally, we evaluate the performance of these designs as training sets for a random forest algorithm aimed at identifying best-producing strains. Evaluating the robustness of these designs to noise and missing data, traits inherent to biological datasets, we find that while resolution V designs capture most information present in full factorial data, they necessitate the construction of a large number of strains. On the other hand, resolution III and PB designs fall short in identifying optimal strains and miss relevant information. Moreover, given the small number of experiments required for the optimization of a pathway with seven genes, linear models outperform random forest. Consequently, we propose the use of resolution IV designs followed by linear modeling in Design-Build-Test-Learn (DBTL) cycles targeting the screening of multiple factors. These designs enable the identification of optimal strains and provide valuable guidance for subsequent optimization cycles.
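
As a rough illustration of what a resolution IV design followed by linear modeling can look like for seven factors, the sketch below builds a 16-run two-level design and estimates main effects from it. Several things here are assumptions rather than details taken from the article: the two-level coding (-1 for low and +1 for high expression), the textbook generators E=ABC, F=BCD, G=ACD, and the `toy_production` response with its made-up coefficients, which stands in for the kinetic model and is purely hypothetical.

```python
from itertools import combinations, product


def resolution_iv_design():
    """16-run two-level design for seven factors (assumed generators, not the authors')."""
    runs = []
    # Base factors A-D take every low (-1) / high (+1) combination.
    for a, b, c, d in product((-1, 1), repeat=4):
        # Remaining factors are products of base factors: E=ABC, F=BCD, G=ACD.
        runs.append((a, b, c, d, a * b * c, b * c * d, a * c * d))
    return runs


def column(runs, j):
    return [run[j] for run in runs]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def main_effects_clear_of_two_factor_interactions(runs):
    """True if every main-effect column is orthogonal to every two-factor interaction."""
    k = len(runs[0])
    for j in range(k):
        for p, q in combinations(range(k), 2):
            interaction = [run[p] * run[q] for run in runs]
            if dot(column(runs, j), interaction) != 0:
                return False
    return True


def fit_main_effects(runs, response):
    """Least-squares main-effect coefficients; valid because the +/-1 columns are orthogonal."""
    n = len(runs)
    return [dot(column(runs, j), response) / n for j in range(len(runs[0]))]


def toy_production(run):
    # Hypothetical response for illustration only: two main effects
    # plus one two-factor interaction between the first two factors.
    return 10 + 3 * run[0] - 2 * run[3] + 1.5 * run[0] * run[1]


design = resolution_iv_design()
titers = [toy_production(run) for run in design]
coefficients = fit_main_effects(design, titers)

for name, value in zip("ABCDEFG", coefficients):
    print(f"{name}: {value:+.2f}")
```

Under these assumptions the 16 runs replace the 128 strains of a two-level full factorial, and the interaction term in the toy response does not bias the main-effect estimates, which is the property that separates resolution IV from resolution III designs. Two-factor interactions remain aliased with one another, so this sketch says nothing about which interaction is active.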

Original language: English
Pages (from-to): 1959-1967
Number of pages: 9
Journal: Computational and Structural Biotechnology Journal
Volume: 23
Publication status: Published - 1 May 2024

Keywords

  • Cell factory
  • Design of experiments
  • Pathway
