Treść książki

Przejdź do opcji czytnikaPrzejdź do nawigacjiPrzejdź do informacjiPrzejdź do stopki
JerzyTchórzewski,TomaszKania
assigningnobjectstoapredeterminednumberofclustersk.Regardless
ofthevalueoftheparameterk,thealgorithmdoesnotrelyonpreviously
determinedclustersseparatelyforeachvalue.
Thenumberkisakeyinputparameterthatispassedasasetpointor
randomvalue.Itdeterminesthenumberofclustersobtainedasaresultof
thealgorithm'soperation,thereforeitisveryimportanttoselectanade-
quatecriterionfunctionbywhichthequalityofthegroupingwillbeas-
sessed.
Thedisadvantageofthenon-hierarchicalclusteranalysismethodisits
greed,andthereforeonlythelocaloptimumisobtainedasaresult.More-
over,thereisnoguaranteethattheglobaloptimumwillbeachievedatall,
butthegreatadvantageistheeaseofimplementationofthismethodand
thelowcomputationalcomplexity[4-5,28,30].
K-MeansMethod
Themethodsofnon-hierarchicalclusteranalysisincludecombinatorial
clusteranalysisalgorithms,includingthek-meansalgorithm.Thisalgo-
rithmisanadaptationoftheLloydheuristicfordeterminingthematrixof
objectassignmentstoapredeterminednumberofgroupsandthematrix
ofcentersofgravityofthesegroupsinsuchawaythatthequalityindexis
minimized[30].Thisalgorithmisbasedonminimizedvariabilitywithinthe
resultingclusters.Itisthemosttypicalmethodofthisgroupofalgorithms
andthereforethevariabilitybetweenclustersisautomaticallymaximized.
Moreprecisely,elementsaremovedfromclustertoclusteruntilthewithin-
groupandbetweenclustervariationsareoptimized.Themostimportant
aspectofthismethodisthevariabilityoptimization,wheretheobjective
functionistominimizethetraceofacertainintragroupcovariancematrix,
whichisoneofthetwodecompositionmatricesofthescatteringmatrix
(thesecondmatrixistheintergroupcovariancematrix)[12-13,27,34].
Withthismethod,kclusters,asdiverseaspossible,arecreated.
Asreportedbytheauthorsofthework[30],despitetheover50-year
historyofthek-meansalgorithm,therearerelativelyfewworksanalyzing
itsproperties,orworksformulatingitsconvergences.However,itisavery
simpleandeffectivealgorithmforfindinggroupsinthedatabase,andthe
mostcommonmeasureofdistanceistheEuclideandistance,although
othermeasuresofobjectsimilarityarealsoused.
Inpractice,however,variousvariantsofthek-meansalgorithm
areused.Theyinclude,amongotherstheclassick-meansalgorithm,
18