| Title |
Analysis of Lithuanian blood donor data using BigQuery |
| Authors |
Budrytė, Greta ; Potechina, Viktorija |
| DOI |
10.15388/Gronskis.2026 |
| Is Part of |
20th Prof. Vladas Gronskas international scientific conference, 8th of May 2026, Kaunas, Lithuania : abstract book. Vilnius : Vilniaus universiteto leidykla. 2026, p. 46-47 |
| Keywords [eng] |
blood donation ; machine learning ; BigQuery ; clustering ; healthcare data |
| Abstract [eng] |
The study analyzes more than 1.4 million records from the Blood Donor Register, collected by the Hygiene Institute in Lithuania and accessed through the EU open data portal. The research aims to identify trends and characterize types of blood donation by applying SQL-based analytics and developing machine learning models in the Google BigQuery cloud environment. Exploratory data analysis revealed variability in donor health characteristics and differences in hemoglobin levels and donation behavior. K-means clustering identified four distinct donor groups and their features, which can inform targeted donor segmentation and recruitment strategies. A logistic regression model was developed to predict the donation type (paid vs unpaid); however, it showed only moderate performance (ROC AUC ≈ 0.618): class imbalance in the source data led the model to favour the majority class, and the prediction error rate was high. The machine learning models provided useful insights into donor behavior, donor segmentation, and their impact. |
| Published |
Vilnius : Vilniaus universiteto leidykla |
| Type |
Conference paper |
| Language |
English |
| Publication date |
2026 |
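
The abstract reports ROC AUC rather than plain accuracy for the imbalanced paid vs unpaid classifier. The following is a minimal Python sketch of why that matters, not the authors' BigQuery implementation. The 90/10 class split, the labels, and the helper names `roc_auc` and `accuracy` are assumptions introduced only for illustration; no figures from the study are reproduced.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive
    is scored above a randomly chosen negative (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical, synthetic labels: 1 = minority class, 0 = majority class.
# The 90/10 split is an assumption, not the ratio in the donor register.
labels = [1] * 100 + [0] * 900

# A "model" that ignores its input and always predicts the majority class.
majority_predictions = [0] * len(labels)
majority_scores = [0.0] * len(labels)

baseline_accuracy = accuracy(labels, majority_predictions)
baseline_auc = roc_auc(labels, majority_scores)

print(f"accuracy of always-majority baseline: {baseline_accuracy:.3f}")
print(f"ROC AUC of always-majority baseline:  {baseline_auc:.3f}")
```

Under these assumptions the always-majority baseline is right on 90% of records yet has a ROC AUC of 0.5, the value of an uninformative ranking. Read against that baseline, an AUC near 0.6 indicates only weak separation between the classes, consistent with the moderate performance described in the abstract.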