Chapter 1 State of the art
6. Annotation schemes should preferably be based as far as possible on
‘consensual’, theory-neutral analyses of the corpus data.
7. No one annotation scheme can claim authority as a standard, although
as a matter of fact interchange ‘standards’ may arise, through widening
availability of annotated corpora, and perhaps should be encouraged.
Dash (2005) observes that in annotated corpus linguistics three criteria
are usually considered important in any kind of annotation: consistency,
accuracy and speed. Firstly, consistency concerns the uniformity of the
annotation throughout the whole text of a corpus. Secondly, accuracy means
that the tagging is free from any kind of error and adheres to the
definitions and guidelines of the annotation scheme. Thirdly, speed means
that it should be possible to apply the annotation scheme automatically to
a very large quantity of data within a very short span of time.
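
The first two of these criteria can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a procedure taken from Dash (2005): the function names `accuracy` and `consistency`, the toy tokens and the tag labels are all hypothetical. Accuracy is taken here as the share of tokens whose tag agrees with a reference ("gold") annotation. Consistency is approximated as the share of tokens that carry the most frequent tag of their word form. This proxy is crude, because a genuinely ambiguous word form may legitimately receive different tags in different contexts.

```python
from collections import defaultdict


def accuracy(predicted, gold):
    """Share of tokens whose tag matches the reference (gold) tag."""
    if len(predicted) != len(gold):
        raise ValueError("tag sequences must have the same length")
    if not gold:
        return 0.0
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)


def consistency(tokens, tags):
    """Share of tokens carrying the most frequent tag of their word form.

    Assumption: uniform tagging of a word form is used as a rough proxy
    for consistency; truly ambiguous forms will lower the score.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    if not tokens:
        return 0.0
    counts = defaultdict(lambda: defaultdict(int))
    for token, tag in zip(tokens, tags):
        counts[token.lower()][tag] += 1
    majority = sum(max(by_tag.values()) for by_tag in counts.values())
    return majority / len(tokens)


# Hypothetical toy data: six tokens, one of them tagged inconsistently.
tokens = ["the", "run", "the", "run", "a", "run"]
tags = ["DET", "NOUN", "DET", "VERB", "DET", "NOUN"]
gold = ["DET", "NOUN", "DET", "NOUN", "DET", "NOUN"]

print(f"accuracy:    {accuracy(tags, gold):.3f}")
print(f"consistency: {consistency(tokens, tags):.3f}")
```

The third criterion, speed, could be assessed in the same spirit by timing an automatic tagger over a large sample, for instance with `time.perf_counter` from the Python standard library.
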
Above we mentioned the problem of the lack of uniformity in annotated
corpus linguistics. However, it is not the only problem that corpus linguists
are facing. Among others, there is also the problem of how representative
a given corpus is, and the problem of what size it should have in order
to be representative. Kohnen (2007) notes that a first major difficulty in
corpus linguistics is connected with corpus size, as it is not known exactly
how large corpora must be in order to qualify for valid linguistic research.
Moreover, he states that on surveying the field one can get the impression
that even in the age of so-called second-generation mega-corpora, researchers
seem to be less confident about the ‘definite’ size that corpora should have.
Kohnen also notes that the problem of representativeness is another central
concern in corpus linguistics and corpus linguists should aim at building
such corpora that would be representative. However, he admits that when
we are dealing with representativeness, many researchers are very reserved.
According to Biber et al. (1998), a corpus is not a mere collection of
texts. A corpus should rather seek to represent a language or some part of
language. Therefore the appropriate design for a corpus is dependent upon
what it is going to represent and the kinds of research questions that can
be addressed, and the generalisability of the results of the research, in turn,
is determined by the representativeness of the corpus. They conclude that “it
is important to realize up front that representing a language – or even part
of a language – is a problematic task. We do not know the full extent of
variation in languages or all the contextual variables that need to be covered
in order to capture all variation in texts” (p. 246). Mukherjee (2004) admits
pessimistically that it is not possible to attain absolute representativeness,
whereas, according to Römer (2005: 41), “a large corpus can generally be
regarded more representative of the type of language it consists of than