Radiomics is a specialized branch of medical imaging in which quantitative features are extracted from images. Classification with radiomics data faces two common problems: class imbalance and a large number of features, which increases the risk of overfitting. Moreover, since its main applications and impact are in the clinical field, interpretable models are needed to explain the results. This study compares two modelling approaches: logistic regression, known for its simplicity and interpretability, and RUSBoost, an ensemble method designed to handle class imbalance at the cost of potentially higher complexity. The goal is to determine whether higher complexity and lower interpretability are justified when dealing with radiomics data. Additionally, because feature selection is widely recommended in the literature, we analyze the impact of a feature selection step applied to both classifiers. Test performance measured across 20 repeated splits on two datasets shows that RUSBoost can capture more detailed patterns in the data, but this advantage depends strongly on the dataset at hand.
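The evaluation protocol described in the abstract (repeated splits on imbalanced data) can be illustrated with a minimal sketch. This is not the authors' code. It shows only the random undersampling step that RUSBoost combines with boosting, applied inside 20 repeated stratified splits. The toy labels, the 30% test fraction, the fixed seed, and the balanced accuracy metric are all assumptions made for illustration; the abstract does not specify them.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), keeping class proportions in both parts."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        idx = idx[:]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def random_undersample(indices, labels, rng):
    """Drop majority-class samples at random until all classes have equal size."""
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    n_min = min(len(v) for v in by_class.values())
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (assumed metric, not named in the abstract)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical toy labels: 90 negatives and 10 positives.
labels = [0] * 90 + [1] * 10
rng = random.Random(0)

splits = []
for _ in range(20):  # 20 repeated splits, as in the abstract
    train_idx, test_idx = stratified_split(labels, 0.3, rng)
    # Undersample the training part only; the test part keeps its imbalance.
    balanced_train = random_undersample(train_idx, labels, rng)
    splits.append((balanced_train, test_idx))
```

Undersampling is applied to the training indices only, so each test set retains the original class ratio and the reported performance reflects the imbalanced setting. A classifier that always predicts the majority class scores 0.5 under `balanced_accuracy`, which is why such a metric is often preferred over plain accuracy for imbalanced data.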
Publication details
2024, Proceedings of the 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024, Pages 6937-6942
Radiomics-based classification in imbalanced datasets: Complexity or interpretability (04b Conference paper in volume)
Boesso S., Farina L., Petti M.
ISBN: 979-8-3503-8622-6; 979-8-3503-8623-3