Applying Supervised Machine Learning for Radiation Dose Accumulation
Puumalainen, Valtteri (2025-02-13)
This publication is subject to copyright regulations. The work may be read and printed for personal use. Use for commercial purposes is prohibited.
Open access
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2025021712678
Abstract
Predicting radiation doses in nuclear power plants is a challenging problem that is central to maintaining the radiation safety of workers and ensuring that high exposures do not occur. Such doses can be predicted from sensors and measurements, or by manually reviewing the dose history of personnel. In this study, however, a previously unexplored visit-based machine learning approach for predicting radiation doses was developed. The approach utilises time-relational data on personnel visits to the controlled areas of the OL1 and OL2 (Olkiluoto Unit 1 and 2) nuclear power plants, including the radiation doses measured during these visits. This allows us to predict which interval class a visit falls into, depending on the radiation dose received.
To provide a comprehensive foundation for machine learning modelling, we also examined the regulations governing current activities and analysed the nature of radiation exposure in nuclear power plant environments, including the origins and effects of radiation. Finally, we evaluated the prerequisites and considerations for deploying a comparable application in a production environment.
Through a combination of literature review and experimental analysis, a basis for machine learning analysis was established, adopting five different models: 1) Random Forest, 2) Balanced Random Forest, 3) XGBoost, 4) LightGBM and 5) Easy Ensemble with AdaBoost. Among the models tested, LightGBM achieved the most promising results. However, its performance fell short of expectations due to the inherent imbalance and lack of descriptiveness in the dataset. While the models demonstrated an ability to learn from the data, this learning was insufficient to effectively distinguish between all class intervals. These limitations emphasise the value of integrating additional contextual information, such as the specific work tasks completed during visits, to enhance the dataset's descriptiveness and improve model performance.
By addressing these limitations, this study highlights the broader potential for data-driven modelling and further research. Specifically, we demonstrate that the descriptiveness and contextual relevance of data are as important as, if not more important than, its quantity: the mere existence or abundance of data does not guarantee its applicability to similar data-driven methods.
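As a rough illustration of the interval-class framing and the class imbalance mentioned above, the following minimal sketch bins per-visit doses into classes and computes inverse-frequency class weights. The bin edges, the dose values and the function names are assumptions made for this example only; the abstract does not state the class boundaries or the weighting scheme used in the study.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical lower edges (in mSv) of the second, third and fourth
# interval classes; the actual class boundaries are not given in the abstract.
EDGES_MSV = [0.01, 0.1, 1.0]


def dose_class(dose_msv: float) -> int:
    """Return the index of the interval class that a visit dose falls into."""
    return bisect_right(EDGES_MSV, dose_msv)


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


# Made-up visit doses, skewed toward low values to mimic class imbalance.
doses = [0.0, 0.002, 0.005, 0.0, 0.03, 0.001, 0.2, 0.004, 0.0, 1.5]
labels = [dose_class(d) for d in doses]
weights = balanced_class_weights(labels)
```

With these made-up values, seven of the ten visits land in the lowest class, so that class receives a much smaller weight than the rare high-dose classes. Weights of this kind are one common way to counter imbalance when training tree-based classifiers; whether the study used them is not stated.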