Statements (50)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:algorithm |
gptkbp:alternativeName | gptkb:Lloyd's_algorithm, k-means clustering |
gptkbp:application | image segmentation, document clustering, market segmentation, vector quantization |
gptkbp:category | partitioning method |
gptkbp:complexity | O(nkt) |
gptkbp:convergesTo | local minimum |
gptkbp:field | gptkb:machine_learning, statistics, data mining |
https://www.w3.org/2000/01/rdf-schema#label | k-means |
gptkbp:input | set of data points |
gptkbp:introduced | gptkb:Stuart_Lloyd |
gptkbp:introducedIn | 1957 |
gptkbp:limitation | sensitive to outliers, assumes spherical clusters, sensitive to initial centroids, requires k to be specified |
gptkbp:measures | Euclidean, Manhattan (less common) |
gptkbp:notRecommendedFor | categorical data, non-globular clusters |
gptkbp:objective | minimize within-cluster sum of squares |
gptkbp:optimizedFor | gptkb:Lloyd's_algorithm |
gptkbp:output | clusters |
gptkbp:popularizedBy | gptkb:James_MacQueen |
gptkbp:popularizedYear | 1967 |
gptkbp:purpose | partitioning data into clusters |
gptkbp:relatedTo | gptkb:DBSCAN, gptkb:Gaussian_mixture_model, hierarchical clustering |
gptkbp:requires | number of clusters (k) |
gptkbp:software | gptkb:MATLAB, gptkb:Spark_MLlib, gptkb:scikit-learn, R |
gptkbp:step | initialize centroids, assign points to nearest centroid, update centroids, repeat until convergence |
gptkbp:supportsAlgorithm | unsupervised learning |
gptkbp:uses | Euclidean distance |
gptkbp:variant | fuzzy c-means, k-medoids, mini-batch k-means |
gptkbp:bfsParent | gptkb:Bag_of_Visual_Words |
gptkbp:bfsLayer | 7 |
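
The gptkbp:step values in the table, read in their usual order (initialize centroids, assign points to the nearest centroid, update centroids, repeat until convergence), can be illustrated with a minimal pure-Python sketch. It is one possible reading of those steps, not a reference implementation. The function names `kmeans`, `squared_distance`, and `wcss`, the `max_iter` and `seed` parameters, and the sample points are all assumptions made for illustration. Picking the initial centroids at random from the data and keeping the old centroid when a cluster becomes empty are likewise assumptions that the statements do not specify.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def wcss(points, centroids, labels):
    """Within-cluster sum of squares: the objective k-means tries to minimize."""
    return sum(squared_distance(p, centroids[j]) for p, j in zip(points, labels))


def kmeans(points, k, max_iter=100, seed=0):
    """Sketch of Lloyd-style k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: initialize centroids (assumption: k distinct data points at random).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Step 4: stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Hypothetical sample data: two loose groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(labels, wcss(points, centroids, labels))
```

Because the result depends on the initial centroids, changing `seed` can yield a different local minimum. This matches the "sensitive to initial centroids" limitation and the "local minimum" convergence statement above.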