Abstract [eng] |
This dissertation focuses on machine learning methods for sentiment analysis of large-scale textual datasets. It proposes a hybrid sentiment analysis method, with a recommended set of parameters, that reduces execution time on large textual data while keeping classification accuracy similar to that of classical methods. The proposed hybrid method combines four components: a classical machine learning algorithm, k-Means clustering, the particle swarm optimization metaheuristic, and an ensemble, all integrated into an acceleration method called SpeedUP. SpeedUP is the main method: it automatically performs every part of the proposed hybrid method using the parameters recommended in this research, which are set as its defaults. The hybrid method is tested with well-known classical machine learning algorithms: multinomial naïve Bayes, logistic regression, linear support vector machine, decision tree, and random forest. The results are compared in terms of classification accuracy, and the significance of the differences is tested with Welch's t-test, a statistical hypothesis test for comparing two sample means that does not assume equal variances. |
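To illustrate the significance test mentioned in the abstract, the following is a minimal sketch of Welch's t statistic and its Welch-Satterthwaite degrees of freedom, written with the Python standard library only. The function name `welch_t` and the accuracy values are hypothetical and chosen for illustration; they are not results or code from the dissertation.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard errors of each sample mean (sample variance / n)
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical accuracy scores from repeated runs (illustrative values only)
classical = [0.80, 0.82, 0.81, 0.79, 0.83]
hybrid = [0.79, 0.81, 0.80, 0.80, 0.82]

t_stat, dof = welch_t(classical, hybrid)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A p-value would then be obtained from the t distribution with `dof` degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind(classical, hybrid, equal_var=False)` performs the same test and returns the p-value directly.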