The potential for using whole genome sequencing (WGS) data in microbiological risk assessment (MRA) has been discussed on several occasions since the beginning of this century. Still, the proposed heuristic approaches have never been applied in a practical framework. This is due to the non-trivial problem of mapping microbial information consisting of thousands of loci onto a probabilistic scale for risks. The paradigm change for MRA involves translating multidimensional microbial genotypic information into much-reduced (integrated) phenotypic information, and onwards into a single measure of human risk (i.e. probability of illness).

This paper describes a first methodological approach to applying WGS data in MRA, supported by a practical example. Specifically, genetic data (single nucleotide polymorphisms; SNPs) for Shiga toxin-producing Escherichia coli (STEC) O157 are combined with phenotypic data (in vitro adherence to epithelial cells as a proxy for virulence) in a genome-wide association study (GWAS) for hazard identification.

This application revealed practical implications of using SNP data for MRA. These can be summarized under the following main issues: optimum sample size for valid inference at the population level, correction for population structure, quantification and calibration of results, reproducibility of the analysis, links with epidemiological data, and anchoring and integration of results into a systems biology approach for the translation of molecular studies to human health risk.

Future developments in genetic data analysis for MRA should aim at resolving the mapping problem of processing genetic sequences to arrive at a quantitative description of risk. The development of a clustering scheme focusing on biologically relevant information of the microbe involved would be a useful approach to molecular data reduction for risk assessment.
- Risk assessment