Humanities and Big Data. Exploiting Digital Archives in the Age of Abundance
a piece of information in optical networks halves every nine months, while
Kryder's Law stipulates that the rate of storage density increase, for hard
disk drives, follows processor speed improvements resulting from Moore's Law
(Walter, 2005).
Given all these incredible advancements, one might suspect that we should
not be experiencing significant bottlenecks when processing data with
contemporary digital computers. Indeed, in many cases the best strategy for
coping with problems in processing large amounts of data is simply to wait
until the inevitable increase in processing power and storage capacity
eliminates them. The progress observed in bioinformatics is probably a good
illustration. The Human Genome Project, started in 1990, was described at
that time as a “massive” undertaking as far as computational requirements
were concerned, and was planned as a multi-year project; it was eventually
completed in 2003. Currently, only 25 years later, physicians are considering
personalized medicine approaches in which sequencing an individual genome
should be a routine task, carried out even in a doctor's office. This is
conceivable because the capabilities of modern personal computers allow
storage and processing of genetic information that only a couple of years ago
required dedicated “supercomputers” installed in university data centers.
Personalized medicine, mentioned above, might be possible not only due to
advancements in computing power, but also due to advancements in genome
sequencing technology. The process of determining the sequence of nucleotides
that comprise human DNA was, at the beginning of the Human Genome Project,
both costly and slow. Today, so-called next-generation sequencing technology
makes this process both rapid (taking days or even hours) and relatively
cheap, so hospitals can start sequencing patients' DNA in bulk, and
comparative analysis of these data becomes possible. However, the pace of
change in DNA sequencing is even faster than the computer improvements
resulting from Moore's Law (and the associated laws mentioned above). The
resulting datasets not only become impossible to process on a single computer
(because, e.g., it is not possible to fit them in memory or even on a hard
drive), but the “waiting out” strategy also becomes ineffective. Data that
grow at a faster pace than the capabilities of computing technology are
effectively the “big data”, which require non-standard approaches to the
analysis process. Other examples of such datasets include social connectivity
graphs stored by modern social network services (such as Twitter, Facebook,
or LinkedIn) and the full-text indices required by every contemporary
Internet search engine.
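
The difference between data that can be “waited out” and data that cannot can be illustrated with a small numerical sketch. It assumes, purely for illustration, that both the dataset and the available computing capacity grow exponentially with fixed doubling periods; the function name, the units, and the doubling periods used below are hypothetical and are not taken from the text.

```python
import math


def months_until_catch_up(data_size, capacity,
                          data_doubling_months, capacity_doubling_months):
    """Return the months until capacity reaches the data size.

    Both quantities are assumed to grow exponentially (an illustrative
    assumption). Returns None when capacity never catches up, i.e. when
    the "waiting out" strategy fails.
    """
    if data_size <= capacity:
        return 0.0  # the data already fit, no waiting needed

    # Growth rates in doublings per month; a static dataset
    # (doubling period of math.inf) has a rate of 0.
    data_rate = 1.0 / data_doubling_months
    capacity_rate = 1.0 / capacity_doubling_months

    if capacity_rate <= data_rate:
        return None  # data grow at least as fast as capacity

    # Number of doublings by which the data currently exceed capacity.
    gap_in_doublings = math.log2(data_size / capacity)
    return gap_in_doublings / (capacity_rate - data_rate)


# A static dataset 8 times larger than current capacity, with capacity
# doubling every 18 months (hypothetical figure): waiting works.
print(months_until_catch_up(8, 1, math.inf, 18))

# The same dataset, but doubling every 12 months (hypothetical figure):
# waiting never helps, so the function returns None.
print(months_until_catch_up(8, 1, 12, 18))
```

In the first case the gap closes after roughly three capacity doublings; in the second the gap widens every month, which is the situation the text describes as “big data”.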