Title of object
Efficiency of Random Sampling Based Data Size Reduction on Computing Time and Validity of Clustering in Data Mining
Object description
In data mining, cluster analysis is one of the most widely used techniques for discovering existing groups in datasets. However, traditional clustering algorithms have become insufficient for analyzing big data, which has emerged from the enormous increase in the amount of data collected in recent years. Scalability has therefore been one of the most intensively studied research topics in clustering big data. The parallel clustering algorithms and the Map-Reduce framework based techniqu...
Language of object
English
Author(s) / Publisher
Journal of Agricultural Informatics
Discipline(s)
Computer and Library Sciences
Keywords
data reduction, random sampling, cluster analysis, external validity indices, big data, k-means clustering
Technical format
Text
Intended end-user role
Other
Learning resource type
Research
Typical age range
18+
Copyright - Restrictions
This resource is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported license (CC BY-NC-ND 3.0).
Cost with use
No
Source code available
No