TY - JOUR
T1 - Validation of stock assessment methods
T2 - Is it me or my model talking?
AU - Kell, Laurence T.
AU - Sharma, Rishi
AU - Kitakado, Toshihide
AU - Winker, Henning
AU - Mosqueira, Iago
AU - Cardinale, Massimiliano
AU - Fu, Dan
N1 - Publisher Copyright:
© 2021 International Council for the Exploration of the Sea.
PY - 2021/6/10
Y1 - 2021/6/10
N2 - The adoption of the Precautionary Approach requires providing advice that is robust to uncertainty. Therefore, when conducting stock assessment, alternative model structures and data sets are commonly considered. The primary diagnostics used to compare models are to examine residual patterns to check goodness-of-fit and to conduct retrospective analysis to check the stability of estimates. However, residual patterns can be removed by adding more parameters than justified by the data, and retrospective patterns removed by ignoring the data. Therefore, neither alone can be used for validation, which requires assessing whether it is plausible that a system identical to the model generated the data. Therefore, we use hindcasting to estimate prediction skill, a measure of the accuracy of a predicted value unknown by the model relative to its observed value, to explore model misspecification and data conflicts. We compare alternative model structures based on integrated statistical and Bayesian state-space biomass dynamic models using, as an example, Indian Ocean yellowfin tuna. Validation is not a binary process (i.e. pass or fail) but a continuum; therefore, we discuss the use of prediction skill to identify alternative hypotheses, weight ensemble models and agree on reference sets of operating models when conducting Management Strategy Evaluation.
AB - The adoption of the Precautionary Approach requires providing advice that is robust to uncertainty. Therefore, when conducting stock assessment, alternative model structures and data sets are commonly considered. The primary diagnostics used to compare models are to examine residual patterns to check goodness-of-fit and to conduct retrospective analysis to check the stability of estimates. However, residual patterns can be removed by adding more parameters than justified by the data, and retrospective patterns removed by ignoring the data. Therefore, neither alone can be used for validation, which requires assessing whether it is plausible that a system identical to the model generated the data. Therefore, we use hindcasting to estimate prediction skill, a measure of the accuracy of a predicted value unknown by the model relative to its observed value, to explore model misspecification and data conflicts. We compare alternative model structures based on integrated statistical and Bayesian state-space biomass dynamic models using, as an example, Indian Ocean yellowfin tuna. Validation is not a binary process (i.e. pass or fail) but a continuum; therefore, we discuss the use of prediction skill to identify alternative hypotheses, weight ensemble models and agree on reference sets of operating models when conducting Management Strategy Evaluation.
KW - diagnostics
KW - hindcast
KW - prediction skill
KW - retrospective analysis
KW - stock assessment
KW - validation
U2 - 10.1093/icesjms/fsab104
DO - 10.1093/icesjms/fsab104
M3 - Article
AN - SCOPUS:85103986420
VL - 78
SP - 2244
EP - 2255
JO - ICES Journal of Marine Science
JF - ICES Journal of Marine Science
SN - 1054-3139
IS - 6
ER -