Abstract [eng]
The goal is to find or create a method suitable for scribble recognition. Four machine learning methods were tested:
• K-Nearest Neighbours (KNN) using the Dynamic Time Warping (DTW) distance measure
• Support Vector Machines
• Random Forest
• Convolutional Neural Networks
KNN with DTW proved unsuitable due to the nature of vectorised drawing data and the arbitrary class distinctions imposed by R. Kellogg, who naturally approached the problem of classification from the perspective of children's psychological development. The Support Vector Machine and Random Forest methods perform well despite minimal feature engineering, with Random Forest proving the better of the two, reaching competitive accuracy and classification speed. Finally, convolutional neural networks are shown to handle scribbles well as time-series-like data, apart from significant overfitting, which was solved by data augmentation. Most of R. Kellogg's classes are easy to augment, bringing the dataset from 7,037 scribble instances to 148,029. The best accuracy of this classifier is approximately 98%.
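For readers unfamiliar with the first method listed, the following is a minimal sketch of nearest-neighbour classification with a DTW distance over vectorised strokes. It is not the implementation used in this work: the function names `dtw_distance` and `knn_predict`, the representation of a scribble as a list of (x, y) points, and the class labels in the usage note are all assumptions made for illustration.

```python
import math
from collections import Counter


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of (x, y) points.

    Assumption: each scribble is a single list of 2D points; the local
    cost is the Euclidean distance between aligned points.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a point
                cost[i][j - 1],      # b advances, a repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def knn_predict(query, training_set, k=1):
    """Return the majority label among the k training scribbles closest to query.

    training_set is a list of (points, label) pairs (hypothetical format).
    Ties are resolved in favour of the label seen first, i.e. the nearest one.
    """
    ranked = sorted(training_set, key=lambda item: dtw_distance(query, item[0]))
    labels = [label for _, label in ranked[:k]]
    return Counter(labels).most_common(1)[0][0]
```

With two toy training scribbles labelled "line" and "zigzag" (hypothetical labels, not R. Kellogg's classes), a slightly noisy straight stroke is assigned to "line". Each prediction compares the query against every training scribble at a cost proportional to the product of the two stroke lengths, which is one practical reason a KNN DTW classifier can be slow at classification time.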