Flexible tree-structured regression for clustered data with an application to quality of life in older adults

Bibliographic Details
Main Authors: Spuck, Nikolai (Author), Schmid, Matthias (Author), Berger, Moritz (Author)
Format: Article (Journal)
Language: English
Published: 07 March 2026
In: Advances in data analysis and classification

ISSN: 1862-5355
DOI: 10.1007/s11634-026-00666-9
Online Access: Publisher, free of charge, full text: https://doi.org/10.1007/s11634-026-00666-9
Author Notes: Nikolai Spuck, Matthias Schmid, Moritz Berger
Description
Summary: Tree-structured models are a powerful alternative to parametric regression models if non-linear effects and interactions are present in the data. Yet, classical tree-structured models might not be appropriate if the data come in clusters of units, which requires taking the dependence of observations into account. This is, for example, the case in cross-national studies, as presented here, where country-specific effects should not be neglected. To address this issue, we present a flexible tree-structured approach that achieves a sparse modeling of unit-specific effects and identifies subgroups (based on individual-level covariates) that differ with regard to the outcome. The methodological advances were motivated by the analysis of quality of life in older adults using data from the Survey of Health, Ageing and Retirement in Europe. Application of the proposed model yielded promising results and illustrated the accessibility of the approach. A comparison to alternative methods with regard to variable selection and goodness-of-fit was performed in several simulation experiments. The tree-structured model performed well in settings with a low number of units, a large number of observations per unit and a relatively low number of clusters of units. The key advantages of the approach lie in its high explainability and the intuitive interpretability of the results.
Item Description: Viewed on 07.04.2026
Physical Description: Online Resource