Title
Effectiveness of pre-trained morphological parsers for text classification
Translation of Title |
Iš anksto apmokytų morfologinių analizatorių efektyvumas teksto klasifikacijai. |
Authors |
Šliogeris, Vytenis |
Pages |
57 |
Keywords [eng] |
artificial intelligence, machine learning, natural language processing, morphological analyser, text classification
Abstract [eng] |
State-of-the-art models for natural language processing (NLP), such as transformers, employ positional encodings, which, according to our hypothesis, are less effective at extracting the grammatical functions of words in highly inflected languages. Highly inflected languages tend to have free word order, so the relative positions of words are less closely related to their meaning than in non-inflected languages. A submodule for extracting grammatical information was constructed and pre-trained to aid the overall model in NLP tasks for highly inflected languages. To evaluate the effectiveness of such a submodule, text classification tasks were performed in Lithuanian. The submodule did not significantly improve text classification performance; the results suggest that it may even be detrimental, although it could make the learning process more stable.
Dissertation Institution |
Vilniaus universitetas. |
Type |
Master's thesis
Language |
English |
Publication date |
2024 |