Title Classifying tax evaders by means of social network analytics
Translation of Title Mokesčių vengėjų klasifikacija naudojant socialinių tinklų analizę.
Authors Daraganas, Lukas
Pages 30
Keywords [eng] Logistic regression, Decision Trees, Random Forests, Tax evasion, classification
Abstract [eng] This thesis uses Logistic Regression, Decision Tree, and Random Forest methods to classify companies as tax-evading or tax-compliant. Under Lithuanian VAT law, two relationships between persons are important: ownership and kinship. Based on these relationships, a network is constructed for each year in the dataset, and network features such as network size and community structure are extracted from each network. Of the models examined, Random Forest produced the best results. If these results persist on a different dataset, a Random Forest model with an AUC score of around 0.7 could be useful to the Lithuanian State Tax Inspectorate when selecting potential auditing targets.
Dissertation Institution Vilniaus universitetas.
Type Master thesis
Language English
Publication date 2022
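
The abstract mentions building ownership and kinship networks, extracting features such as network size, and evaluating classifiers by AUC. The following is a minimal sketch of those two ideas only, not the thesis's implementation: the toy edges, labels, and all function names are hypothetical, and network size is used directly as a score in place of the Logistic Regression, Decision Tree, or Random Forest models the thesis describes.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build undirected adjacency sets from (node_a, node_b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def component_sizes(graph):
    """Map each node to the size of its connected component (BFS)."""
    sizes = {}
    for start in list(graph):
        if start in sizes:
            continue
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        for node in seen:
            sizes[node] = len(seen)
    return sizes


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: persons p*, companies c*.
ownership = [("p1", "c1"), ("p1", "c2"), ("p2", "c3"), ("p4", "c4")]
kinship = [("p1", "p2")]

graph = build_graph(ownership + kinship)
sizes = component_sizes(graph)

companies = ["c1", "c2", "c3", "c4"]
labels = [1, 1, 0, 0]                   # hypothetical: 1 = tax evader
scores = [sizes[c] for c in companies]  # network size as a naive score
print(auc(labels, scores))
```

In practice the extracted features would be fed to a trained classifier, and the AUC would be computed on held-out data rather than on the same toy sample.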