Jerzy Tchórzewski, Tomasz Kania
In order to quantify the similarity between objects, a so-called measure
of similarity or, as its dual, a measure of dissimilarity is used; the
lower the value of dissimilarity, the more similar the objects are. A
special case of dissimilarity is the distance between objects, which is a
special kind of metric. For these reasons, the most unambiguous comparison
is between objects with quantitative features; hence the key role in
cluster analysis is played by distance measures, i.e. the dissimilarity
functions of a pair of objects that describe the degree of similarity of
the objects.
The elements separated by a small distance are then combined into
clusters. In general, the distance function is computed from the
qualitative, ordinal, or quantitative variables that characterize the
objects in question, and the computation returns a distance value
expressed as a non-negative real number.
The most frequently used distance measures include: Minkowski distance,
Mahalanobis distance, Bregman distance, cosine distance, power distance,
Chebyshev distance, Euclidean distance, squared Euclidean distance,
city-block distance and many other distance measures [5, 13, 14, 30].
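To make a few of the listed measures concrete, the following minimal sketch implements them in plain Python for two numeric feature vectors of equal length. The function names and the sample vectors `x` and `y` are illustrative assumptions, not taken from the source; the Mahalanobis and Bregman distances are left out because they need additional inputs (a covariance matrix and a generating convex function, respectively).

```python
import math

def minkowski(x, y, p):
    """Minkowski distance of order p (p >= 1)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def squared_euclidean(x, y):
    """Sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def euclidean(x, y):
    """Euclidean distance (Minkowski distance with p = 2)."""
    return math.sqrt(squared_euclidean(x, y))

def city_block(x, y):
    """City-block distance (Minkowski distance with p = 1)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    """Chebyshev distance: the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))

def cosine_distance(x, y):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)

# Hypothetical sample objects described by three quantitative features.
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]
```

For these sample vectors the coordinate differences are 3, 4 and 0, so `euclidean(x, y)` is 5.0, `city_block(x, y)` is 7.0 and `chebyshev(x, y)` is 4.0. All of the functions return non-negative values, in line with the description above.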
Over the years, many different cluster analysis algorithms have been
developed, although with respect to the grouping approach two basic
families of data division methods emerge [5-6, 14, 30]: hierarchical
cluster analysis methods and non-hierarchical cluster analysis methods.
Non-hierarchical methods, also known as combinatorial methods, are based
on assigning objects to created clusters, whereas hierarchical methods
build a classification hierarchy, that is, a hierarchy of classes for a
specified number of observations.
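The idea of assigning objects to already created clusters can be sketched as follows. This is only an illustration under stated assumptions: each cluster is represented by a centre point fixed in advance, the Euclidean distance is used, and the names `assign_to_clusters`, `objects` and `centres` are hypothetical. The source does not prescribe a particular non-hierarchical algorithm.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def assign_to_clusters(objects, centres):
    """Return, for each object, the index of its nearest cluster centre."""
    labels = []
    for obj in objects:
        distances = [euclidean(obj, centre) for centre in centres]
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical data: four objects and two cluster centres chosen in advance.
objects = [[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]]
centres = [[1.0, 1.5], [8.5, 9.0]]

labels = assign_to_clusters(objects, centres)
print(labels)  # [0, 0, 1, 1]
```

An iterative method would typically alternate such an assignment step with an update of the cluster representatives; only the assignment step is shown here.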
1.2.1. Hierarchical methods of cluster analysis
The methods of hierarchical cluster analysis belong to the traditional
methods of cluster analysis and consist in successively combining or
dividing the observations [30], which in turn yields a tree (dendrogram).
Cluster analysis is mainly aimed at obtaining homogeneous clusters of
data. As with almost any data mining technique, there are many different
criteria for dividing the data into clusters. The most frequently used
criteria in hierarchical cluster analysis are the similarity of the
objects assigned to a given group or the dissimilarity of the elements of
one group from those of the other groups.
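The successive combining of observations can be illustrated with a small agglomerative sketch. It assumes the Euclidean distance and the single-linkage criterion (the distance between two clusters is the smallest distance between their members). Both are choices made for this example, not ones stated in the source, and the names `agglomerate` and `points` are hypothetical. The returned list of merges is the information a dendrogram would display.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def single_linkage(cluster_a, cluster_b, points):
    """Smallest distance between any member of one cluster and any of the other."""
    return min(euclidean(points[i], points[j])
               for i in cluster_a for j in cluster_b)

def agglomerate(points):
    """Merge the two closest clusters repeatedly until one cluster remains.

    Returns a list of (cluster_a, cluster_b, distance) tuples in merge order.
    """
    clusters = [[i] for i in range(len(points))]  # start: one object per cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = single_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical data: two visibly separated pairs of points.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 7.0]]
merges = agglomerate(points)
for cluster_a, cluster_b, distance in merges:
    print(cluster_a, cluster_b, round(distance, 3))
```

For this data, objects 0 and 1 are merged first (distance 1.0), then objects 2 and 3 (distance 2.0), and finally the two resulting groups. A divisive method would proceed in the opposite direction, starting from one cluster and splitting it.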