Title |
Simplifying Lithuanian texts into Easy-to-Read language using large language models |
Authors |
Kuoraitė, Simona ; Gružauskas, Valentas |
ISBN |
9782970189756 |
Is Part of |
1st workshop on Artificial Intelligence and Easy and Plain Language in Institutional Contexts (AI & EL/PL), June 23, Geneva, Switzerland : proceedings of the workshop. Geneva : European Association for Machine Translation, 2025. p. 30-37. ISBN 9782970189756 |
Keywords [eng] |
large language model ; Easy-to-Read language ; fine-tuning ; Lithuanian text simplification |
Abstract [eng] |
This paper explores the task of simplifying Lithuanian texts into Easy-to-Read language. Easy-to-Read is a form of language written in short, clear sentences and simple words, adapted for people with intellectual disabilities or limited language skills. The aim of this work is to investigate how the large language model Lt-Llama-2-7b-hf, pre-trained on Lithuanian language data, can be adapted to the task of simplifying Lithuanian texts into Easy-to-Read language. To achieve this goal, specialized datasets were developed to fine-tune the model, and experiments were carried out. The model was tested by comparing texts in their original language and texts with a prompt adapted to the task. The results were evaluated using the SARI metric for assessing the quality of simplified texts and a qualitative evaluation of the large language model. The results show that the fine-tuned model sometimes simplifies text better than the model that was not fine-tuned, but that a larger dataset would be needed to achieve significant results, and that more research should be carried out on fine-tuning the model for this task. |
Published |
Geneva : European Association for Machine Translation, 2025 |
Type |
Conference paper |
Language |
English |
Publication date |
2025 |