Treść książki

Przejdź do opcji czytnikaPrzejdź do nawigacjiPrzejdź do informacjiPrzejdź do stopki
20
Web2.0Revolution
Thisbelief,beingquestionableafewyearsback,gainsmoreweightasthe
numberoffreeservicesincreases.Inalmostallsoftwarecategoriesonemay
findopensourcereplacementsofproprietarysoftware,beingitofficesuites,
businessintelligencetools,webanalytics,hostingservices,etc.Finally,the
mostimportantfifthrulestatesthatthemainreasonforlock-inandthe
mostvaluableassetisneithersoftwarenorhardware,butthedataitself.
Thisconvictionhasbeenlongpopularamongdatabaseengineers,butnot
widespreadincomputersciencecommunity.Softwareplatformschangeevery
fiveyears,hardwareplatformschangeeverysevenyears,butthedatare-
mainforever.Thisclaimmayseemexaggerated,butitiscounter-intuitively
correct.Mostcontemporarydatabasesystemsdonotconsiderdeletingany
dataandmanyfreeservicesdonotimposelimitsonthevolumeofdata
beingstored.Asofthetimeofwriting,thepopularfreemailhostingser-
viceGMail[141]offersitsusers6628MBoffreestoragespaceandadds
another3MBtoeachuserperday.Storingandqueryingdatahasbecome
easierwithadvancesinrelationaldatabasetechnology,butacquiringdata
andfeedbackfromusersisdifficult.ThemainmantraofWeb2.0propo-
nentsistoturnspectatorsintoparticipants.Thismeanscreatingincentives
foruserstoparticipateandcontribute.Manysolutionsfromtheworldof
Web2.0relyheavilyonusers?input,e.g.,folksonomies,blognetworks,wikis,
recommenderengines.Companiesthatmanagetoattractthebiggestcrowd
ofdevotedfollowersaremostlikelytogaincompetitiveedgeandwinthe
race.Othersourcesofsuccessarenamespacesandproprietaryformats.Ex-
amplesofnamespacesincludeCompactDiscDatabase(CDDB)operated
byGracenoteInc.,NetworkSolutionsmanagingalmost8milliondomain
names,orVeriSignmanagingdigitalcertificates,paymentsprocessing,and
rootnameservers.Proprietaryformatsinclude,amongothers,formatsfrom
MicrosoftOfficetoolsorAppleiTunes.Alltheseexamplesrepresentpopular
formatsandsolutionsthathavereachedthecriticalmassofpopularityand
acceptance,andareperceivedasdefactostandardsdespitebeingpropri-
etaryassets.
Theaboverulesdefinethetransitiontoanewsoftwaredevelopment
paradigm,anewapproachtousers,andanewapproachtocollaboration
andcooperation.Byitself,thesechangesshouldnotaffectanydatamin-
ingactivities,becausethealgorithmsforpatterndiscoveryareobliviousto
particularsoftwarearchitecture.However,theparadigmshiftenforcedby
theWeb2.0revolutionstronglyaffectsthedatagatheredandprocessedby
Web2.0applications.AswewilldiscussinChapter2,thechangestodata
distributions,dataqualityanddatasemanticsaresoprofound,thattradi-
tionaldataminingalgorithmscannotcopewiththesedataandnewmethods
mustbedevelopedinordertodiscovermeaningfulpatterns.