Title HALT-PROP: human-annotated Lithuanian textual corpus for propaganda narratives and techniques
Authors Rizgelienė, Ieva ; Zubaitienė, Vilma ; Maliukevičius, Nerijus ; Marcinkevičius, Virginijus
DOI 10.1038/s41597-025-06367-w
Is Part of Scientific data. Berlin : Nature Portfolio. 2026, vol. 13, art. no. 47, p. [1-20]. ISSN 2052-4463
Keywords [eng] propaganda ; textual corpus ; propaganda techniques ; propaganda narratives
Abstract [eng] In the contemporary technological landscape, propaganda has become one of the most pervasive tools in information warfare. Social media platforms and entire media ecosystems are leveraged to disseminate hostile propaganda aimed at polarizing societies, destabilizing states, and eroding longstanding democratic processes. Malign propaganda is not only common in widely-spoken languages but also targets less-spoken languages to maximize its reach and influence. While progress has been made in developing models capable of detecting propaganda, most advances have focused on high-resource languages. In contrast, low-resource languages continue to face significant limitations, the most critical being the scarcity of annotated datasets. In many regions and countries, such resources are entirely absent. To address this gap, we present the HALT-PROP dataset, the first human-annotated Lithuanian textual propaganda corpus. The corpus comprises two complementary datasets: (1) 2,870 news articles manually labeled by five experts at the article level to identify the presence of propaganda; and (2) a subset of 1,000 articles annotated for specific propaganda techniques and narratives using a cross-annotation approach.
Published Berlin : Nature Portfolio
Type Journal article
Language English
Publication date 2026