Title Speaker Identification and Voice Impairments detection
Translation of Title Kalbėtojo identifikavimas ir balso sutrikimų atpažinimas.
Authors Sira Jagadeeshchandra, Anupama
Pages 72
Keywords [eng] Voice ; Identification ; disorder ; detection
Abstract [eng] Speech processing is one of the best biometric identification methods, and its many applications and advantages have facilitated speaker identification. Building on speech features, this research addresses two topics. The first is text- and language-independent speaker identification, with a comparison of two types of classifier. Features are extracted using Mel-Frequency Cepstral Coefficients (MFCC), i-vectors and Vector Quantization, and are then classified by Euclidean distance and by Probabilistic Linear Discriminant Analysis (PLDA) in the two respective methodologies. The Euclidean distance and PLDA classifiers are compared on several performance measures: Equal Error Rate, thresholding (False Rejection Rate and False Acceptance Rate), Detection Error Trade-off, Receiver Operating Characteristic and Detection Cost Function. These measures follow the NIST 2016 Speaker Recognition Evaluation Plan. To show that the methodologies work across languages, the LibriSpeech (English), Uyghur (Chinese) and Korean databases are used, with both male and female speakers and many utterances. Classification by Euclidean distance depends on thresholding, whereas PLDA depends on LDA for dimensionality reduction, which makes it possible to test different dimensions and select the one giving the best result. The main goal of the implementation is thus to obtain a low Equal Error Rate and Detection Cost Function, so that the methodologies give maximum accuracy and precision in text-independent and multilingual conditions. The second part applies the same techniques, MFCC and i-vectors for feature extraction together with Support Vector Machines (SVM) for classification, to voice impairment detection.
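The Equal Error Rate, False Acceptance Rate, False Rejection Rate and Detection Cost Function named above can be illustrated with a short sketch. This is not the thesis implementation (which was written in MATLAB); it is a minimal Python example that assumes distance scores, where a trial is accepted when its distance falls at or below the threshold. The function names and the cost parameters `c_miss`, `c_fa` and `p_target` are illustrative; their values should be taken from the evaluation plan in use.

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """Error rates for distance scores: accept when distance <= threshold."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    far = np.mean(impostor <= threshold)  # impostor trials wrongly accepted
    frr = np.mean(genuine > threshold)    # genuine trials wrongly rejected
    return far, frr

def equal_error_rate(genuine, impostor):
    """Sweep every observed score as a threshold; return (EER, threshold)."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    rates = np.array([far_frr(genuine, impostor, t) for t in thresholds])
    far, frr = rates[:, 0], rates[:, 1]
    i = np.argmin(np.abs(far - frr))      # point where FAR and FRR are closest
    return (far[i] + frr[i]) / 2, thresholds[i]

def detection_cost(p_miss, p_fa, c_miss, c_fa, p_target):
    """Weighted detection cost; the parameter values are left to the caller."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1 - p_target)

# Made-up distances, for illustration only.
genuine_scores = [0.1, 0.2, 0.3, 0.6]
impostor_scores = [0.4, 0.7, 0.8, 0.9]
eer, thr = equal_error_rate(genuine_scores, impostor_scores)
```

With these made-up scores the sweep finds the threshold at which the two error rates coincide. On real data the two curves rarely cross exactly at an observed score, so the sketch averages FAR and FRR at the closest point.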
The TORGO dysarthric speech database, which contains abnormal voices of 3 female and 5 male speakers with different utterances, is used for classification, with LibriSpeech supplying the normal voice samples. This methodology is well suited to classification, although support vector machines are also widely used for regression. In this case, the SVM is used for classification and is implemented in MATLAB R2018a, the latest version at the time. Accuracy, precision, recall, sensitivity and specificity are calculated as performance measures from the classified data. The implementation and algorithm flow are explained in detail, together with experimental results, in the thesis.
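The performance measures listed for the voice impairment classifier can all be derived from a binary confusion matrix. The sketch below is a minimal Python illustration, not the thesis's MATLAB code. It assumes that label 1 marks an impaired voice and label 0 a normal voice, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall/sensitivity and specificity from labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    recall = ratio(tp, tp + fn)  # recall and sensitivity are the same quantity
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": recall,
        "sensitivity": recall,
        "specificity": ratio(tn, tn + fp),
    }

# Made-up labels, for illustration only (1 = impaired, 0 = normal).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Recall and sensitivity are two names for the same ratio, so the sketch reports one value under both keys.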
Dissertation Institution Šiaulių universitetas.
Type Master thesis
Language English
Publication date 2018