Symptom clusters among cancer survivors: what can machine learning techniques tell us?

Koen I. Neijenhuijs, Carel F.W. Peeters, Henk van Weert, Pim Cuijpers, Irma Verdonck-de Leeuw*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Purpose: Knowledge regarding symptom clusters may inform targeted interventions. The current study investigated symptom clusters among cancer survivors, using machine learning techniques on a large data set.

Methods: Data consisted of self-reports of cancer survivors who used ‘Oncokompas’, a fully automated online application that supports them in their self-management. It does so by 1) monitoring their symptoms through patient-reported outcome measures (PROMs); and 2) providing a personalized overview of supportive care options tailored to their scores, aiming to reduce symptom burden and improve health-related quality of life. In the present study, data on 26 generic symptoms (physical and psychosocial) were used. For each symptom, the PROM result is presented to the user as a score indicating no, moderate, or high well-being risk. Data of 1032 cancer survivors were analysed using Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) on high-risk scores and moderate-to-high-risk scores separately.

Results: When analysing the high-risk scores, seven clusters were extracted: one main cluster, which contained the most frequently occurring physical and psychosocial symptoms, and six subclusters with different combinations of these symptoms. When analysing the moderate-to-high-risk scores, three clusters were extracted: two main clusters, which separated physical symptoms (and their consequences) from psychosocial symptoms, and one subcluster with only body weight issues.

Conclusion: There appears to be an inherent difference in the co-occurrence of symptoms depending on symptom severity. Among survivors with high-risk scores, the clustering showed more connections between physical and psychosocial symptoms, spread over separate subclusters. Among survivors with moderate-to-high-risk scores, the clustering showed fewer connections between physical and psychosocial symptoms.
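The Methods describe running the clustering separately on high-risk scores and on moderate-to-high-risk scores. As a rough illustration of what that split could look like before any clustering step, the sketch below binarizes three-level risk scores at two thresholds and computes a pairwise distance between symptoms. Everything specific here is an assumption made for the example and is not taken from the study: the 0/1/2 coding of the risk levels, the toy data, the helper names `binarize` and `symptom_jaccard_distances`, and the choice of Jaccard distance.

```python
# Illustrative sketch only. The 0/1/2 risk coding, the toy data, and the use of
# Jaccard distance are assumptions for this example, not details from the study.
from itertools import combinations

NO_RISK, MODERATE, HIGH = 0, 1, 2  # hypothetical coding of the three risk levels


def binarize(scores, threshold):
    """Return a 0/1 matrix: 1 where a survivor's risk score is >= threshold."""
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


def symptom_jaccard_distances(indicators):
    """Pairwise Jaccard distance between the symptom columns of a 0/1 matrix."""
    n_symptoms = len(indicators[0])
    # For each symptom, the set of survivors (row indices) flagged for it.
    flagged = [
        {i for i, row in enumerate(indicators) if row[j]}
        for j in range(n_symptoms)
    ]
    dist = [[0.0] * n_symptoms for _ in range(n_symptoms)]
    for a, b in combinations(range(n_symptoms), 2):
        union = flagged[a] | flagged[b]
        # Two symptoms that nobody reports are treated as maximally distant.
        d = 1.0 - len(flagged[a] & flagged[b]) / len(union) if union else 1.0
        dist[a][b] = dist[b][a] = d
    return dist


# Toy data: 4 survivors (rows) x 3 symptoms (columns).
toy_scores = [
    [2, 2, 0],
    [2, 1, 0],
    [1, 2, 1],
    [0, 0, 2],
]

high_only = binarize(toy_scores, HIGH)             # high risk only
moderate_to_high = binarize(toy_scores, MODERATE)  # moderate or high risk

dist_high = symptom_jaccard_distances(high_only)
dist_mod = symptom_jaccard_distances(moderate_to_high)
```

In this toy data the first two symptoms always co-occur at the moderate-to-high threshold (distance 0) but only partly at the high threshold, which shows how the two analyses can yield different groupings. A distance matrix of this kind, converted to a NumPy array, could then be passed to an HDBSCAN implementation that accepts precomputed distances, for example scikit-learn's `sklearn.cluster.HDBSCAN(metric="precomputed")`. Whether the study's pipeline resembles this is not stated in the abstract.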

Original language: English
Article number: 166
Journal: BMC Medical Research Methodology
Issue number: 1
Publication status: Published - 16 Aug 2021


  • Cancer
  • Machine learning
  • Oncology
  • Symptom clusters

