Reliability of multi-source cross-validation estimation in ECG classification : An empirical study
Leinonen, Tuija (2024-05-03)
This publication is subject to copyright. The work may be read and printed for personal use. Use for commercial purposes is prohibited.
Closed
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2024052737248
Abstract
Cardiovascular diseases are a significant global health concern, causing millions of deaths annually and presenting a major burden on healthcare worldwide. The growing prevalence of these conditions underscores the urgent need for effective diagnostic and predictive tools to improve patient outcomes. In recent decades, artificial intelligence has been integrated into cardiac healthcare, offering new insights and improved disease management capabilities. Yet, the widespread adoption of AI-based models in clinical practice faces challenges due to the inadequate use of model evaluation methods, resulting in an overestimation of model performance.
This thesis addresses urgent issues in the application of AI in cardiovascular medicine, focusing on the reliability of model evaluation. Traditionally, clinical prediction models have been evaluated using single-source datasets, which has been shown to lead to an overestimation of performance when the models are applied to new sources, such as data from other hospitals. Traditional cross-validation methods, such as K-fold cross-validation, while useful, may not accurately reflect actual performance when confronted with data from unseen sources. The availability of larger and more diverse data sources offers valuable insights into the uncertainty associated with estimated prediction performance.
Through an empirical study, this thesis examines the reliability of standard K-fold cross-validation and leave-source-out cross-validation in both single-source and multi-source settings. By combining and harmonizing data from two sources, namely the PhysioNet CinC Challenge 2021 data and the Shandong Provincial Hospital database, this thesis provides a comprehensive evaluation of the reliability of the estimates obtained from these cross-validation methods. The results reveal a systematic overestimation of model performance with K-fold cross-validation, which is especially significant when the model is applied to new data sources. In contrast, leave-source-out cross-validation emerges as a more robust approach, offering more reliable performance estimates, albeit with increased variability.
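The difference between the two splitting schemes can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the toy source labels "A", "B" and "C", and the fold count are assumptions made purely for illustration. Standard K-fold shuffles all recordings together, so a test fold usually contains recordings from sources also seen in training, whereas leave-source-out holds out every recording from one source at a time.

```python
import random
from collections import defaultdict


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, test


def leave_source_out_splits(sources):
    """Yield (held_out_source, train_idx, test_idx), one split per distinct source."""
    by_source = defaultdict(list)
    for idx, src in enumerate(sources):
        by_source[src].append(idx)
    for held_out in sorted(by_source):
        test = by_source[held_out]
        train = [i for i, s in enumerate(sources) if s != held_out]
        yield held_out, train, test


# Hypothetical toy data: the source (e.g. hospital) of each recording.
sources = ["A"] * 6 + ["B"] * 4 + ["C"] * 5

kfold = list(k_fold_splits(len(sources), k=3))
lso = list(leave_source_out_splits(sources))

for held_out, train, test in lso:
    train_sources = sorted({sources[i] for i in train})
    print(f"held out {held_out}: train on {train_sources}, test on {len(test)} recordings")
```

In the leave-source-out splits, the held-out source never appears in the training indices, which is what lets the resulting estimate stand in for performance on an unseen source. The K-fold splits give no such guarantee.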
This thesis underscores the risks of obtaining misleading cross-validation results in medical data and demonstrates how these issues can be addressed through access to multi-source data. By acknowledging the limitations of traditional cross-validation methods and embracing more comprehensive evaluation strategies, the healthcare community can enhance the reliability and efficacy of AI-driven solutions in combating cardiovascular diseases globally.