Title Analysis of Lithuanian blood donor data using BigQuery
Authors Budrytė, Greta ; Potechina, Viktorija
DOI 10.15388/Gronskis.2026
Is Part of 20th Prof. Vladas Gronskas international scientific conference, 8th of May 2026, Kaunas, Lithuania : abstract book. Vilnius : Vilniaus universiteto leidykla. 2026, p. 46-47
Keywords [eng] blood donation ; machine learning ; BigQuery ; clustering ; healthcare data
Abstract [eng] The study analyses more than 1.4 million records from the Blood Donor Register, collected by the Hygiene Institute in Lithuania and accessed through the EU open data portal. The research aims to identify trends and characterize types of blood donation by applying SQL-based analytics and developing machine learning models in the Google BigQuery cloud environment. Exploratory data analysis revealed variability in donor health characteristics and differences in hemoglobin levels and donation behavior. K-means clustering identified four distinct donor groups and their features, which can inform targeted donor segmentation and recruitment strategies. A logistic regression model was developed to predict the donation type (paid vs. unpaid); however, it showed moderate performance (ROC AUC ≈ 0.618), favoured the majority class because of class imbalance in the source data, and had a high prediction error rate. The machine learning models provided useful insights into donor behavior, segmentation, and impact.
Published Vilnius : Vilniaus universiteto leidykla
Type Conference paper
Language English
Publication date 2026