Title Duomenų tyrybos sistemų galimybių tyrimas įvairių apimčių duomenims analizuoti
Another Title Investigation of the abilities of data mining systems to analyse various volume datasets.
Authors Paulauskienė, Kotryna ; Kurasova, Olga
DOI 10.15388/Im.2013.0.2052
Is Part of Informacijos mokslai / Vilniaus universitetas. Vilnius : Vilniaus universiteto leidykla. 2013, vol. 65, pp. 85-95. ISSN 1392-0561
Abstract [eng] The aim of the paper is to determine what volume of data popular data mining systems can analyse within a reasonable period of time when solving classification and clustering problems. Three open source data mining systems are investigated: WEKA, KNIME, and ORANGE. The experiments were carried out with eight datasets in which the number of attributes was fixed at 100 and the number of instances ranged from 5000 to 600 000. The experimental investigation has shown that datasets of more than 50 000 instances are too large for the ORANGE system; to analyse larger datasets, the WEKA and KNIME systems need to be used. Datasets of more than 200 000 instances are too large for WEKA and KNIME; however, when simple classification methods are used, both systems are able to handle 400 000 instances, and KNIME can handle 600 000 instances. The results have shown that KNIME can handle larger datasets than WEKA when some classification methods are applied. The classification accuracy achieved with the methods implemented in the systems is sufficiently high.
Published Vilnius : Vilniaus universiteto leidykla
Type Journal article
Language Lithuanian
Publication date 2013