Abstract [eng] |
This master’s thesis proposes and analyses a collection of methods, and prototypes built on them, for detecting violent Lithuanian-language comments on internet news websites. Support-vector machines, naive Bayes and FastText machine learning algorithms are used for text classification. The classifiers are given comment texts that have been preprocessed with stemming algorithms and complemented with non-text parameters of the comments and with sentiment markers. The sentiment markers are computed by machine learning algorithms trained on internet reviews of goods and services. Compared with publicly available benchmark classifiers, the proposed violent content detection prototypes achieved better results on all relevant statistical metrics. Keywords: sentiment analysis, text classification, internet news websites, internet comments. |
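
The pipeline summarised in the abstract (stemmed text, plus non-text parameters, plus a sentiment marker, fed to a classifier) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the thesis implementation: the suffix list in `crude_stem` is a hypothetical stand-in for a real Lithuanian stemmer, the exclamation-mark flag is an invented example of a non-text parameter, the sentiment scores are supplied by hand rather than computed by a model trained on reviews, the four training comments are made up, and the classifier is a small hand-written multinomial naive Bayes rather than any of the tools used in the thesis.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical suffix list, for illustration only; a real Lithuanian
# stemmer handles far more endings and exceptions.
SUFFIXES = ("iams", "ams", "ius", "iai", "us", "as", "is", "ai", "iu",
            "u", "a", "e", "i")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def featurize(text, sentiment):
    """Turn a comment into stemmed tokens plus marker tokens.

    `sentiment` is assumed to come from a separate sentiment model;
    here it is simply passed in as a number in [-1, 1].
    """
    tokens = [crude_stem(w) for w in re.findall(r"\w+", text.lower())]
    if "!" in text:  # assumed example of a non-text parameter
        tokens.append("__EXCLAMATION__")
    tokens.append("__SENT_NEG__" if sentiment < 0 else "__SENT_POS__")
    return tokens


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for t in tokens:
                if t in self.vocab:  # ignore tokens never seen in training
                    score += math.log((self.token_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy comments with hand-assigned sentiment scores.
train = [
    ("Užmušiu tave!", -0.9, "violent"),
    ("Sumušti juos visus!", -0.8, "violent"),
    ("Geras straipsnis", 0.7, "neutral"),
    ("Ačiū autoriui", 0.6, "neutral"),
]
model = NaiveBayes().fit(
    [featurize(text, s) for text, s, _ in train],
    [label for _, _, label in train],
)

print(model.predict(featurize("Užmušiu juos!", -0.7)))
print(model.predict(featurize("Geras autorius", 0.5)))
```

Encoding the non-text parameter and the sentiment marker as extra tokens keeps the sketch short; a feature-vector classifier such as a support-vector machine would more naturally take them as additional numeric columns.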