Title Accent identification using machine learning
Translation of Title Akcento identifikavimas naudojant mašininį mokymąsi.
Authors Grigaliūnaitė, Justina
Pages 50
Keywords [eng] Accent Identification, Spectrogram, ResNet, Support Vector Machine
Abstract [eng] Accent identification can improve the performance of many Automatic Speech Recognition (ASR) systems. This thesis focuses on phoneme-based accent identification and proposes a classification approach for two speaker groups, native French speakers and American speakers, detecting accent in pronounced French words and in extracted /u/-/y/ vowel sounds. The classification is carried out on a new psycholinguistic dataset derived from experimental research. The developed approach could be used as a pre-processing step in complex ASR systems. Moreover, the problem is highly relevant to the development of methodology in linguistics and cognitive psychology, which lack objective ways to efficiently identify and assess foreign accent. The underlying idea for the class difference is based on linguistic research and on the fact that some language-specific sounds are extremely difficult for non-native speakers to reproduce, which results in a foreign accent. The scientific experiment presented in the thesis consists of several parts: classification of vowel and word samples represented as spectrograms using a Residual Neural Network (ResNet); comparison with a baseline Support Vector Machine (SVM) model employing Mel-frequency Cepstral Coefficients (MFCCs); and a suggested improvement to SVM classification using spectrogram data. ResNet classification achieves up to 81.6% accuracy for accent identification from vowel samples and up to 93.06% accuracy from word samples. These results are significantly better than those of the baseline model, which reaches 66.46% and 67.34% accuracy for vowel and word samples respectively. In addition, the thesis suggests a possible performance improvement for the baseline SVM model by using a spectrogram representation of the signal instead of the widely used MFCCs. The thesis provides an introductory analysis of the interdisciplinary topic of accent identification, which requires knowledge of linguistic basics, sound signal processing and supervised learning.
The results of this study show the suitability of the 2D spectrogram representation and the ResNet model for sound data classification. The study also raises further questions that could be researched to improve accent identification. In addition, the thesis succeeded in solving an accent classification problem that traditional acoustic methods used by psycholinguists failed to tackle. This indicates that machine-learning-based methods can improve research methodology in psycholinguistics and other fundamental research fields.
Dissertation Institution Vilniaus universitetas.
Type Master thesis
Language English
Publication date 2022