oldoutk: value of OUTk(n) from the previous iteration
begin ComputeReachingDefs
  foreach node n ∈ CFG do
    OUTd(n) = GEN(n)
  endfor
  change = true
  while change do
    change = false
    foreach node n ∈ CFG do
      /* compute INd(n), save OUTd(n), and recompute OUTd(n) */
      INd(n) = ∪ OUTd(p), where p is a predecessor of n in the CFG
      oldoutd = OUTd(n)
      OUTd(n) = GEN(n) ∪ (INd(n) − KILLd(n) − KILLp(n))
      /* compute INp(n), save OUTp(n), and recompute OUTp(n) */
      INp(n) = ∪ OUTp(p), where p is a predecessor of n in the CFG
      oldoutp = OUTp(n)
      OUTp(n) = (INp(n) − KILLd(n)) ∪ (INd(n) ∩ KILLp(n))
      /* compute INk(n), save OUTk(n), and recompute OUTk(n) */
      INk(n) = ∪ OUTk(p), where p is a predecessor of n in the CFG
      oldoutk = OUTk(n)
      OUTk(n) = INk(n) ∪ ((INd(n) ∪ INp(n)) ∩ KILLd(n))
      if oldoutd ≠ OUTd(n) or oldoutp ≠ OUTp(n) or oldoutk ≠ OUTk(n) then
        change = true
      endif
    endfor
  endwhile
end ComputeReachingDefs
Figure 5: The algorithm for computing reaching definitions to identify def-use associations of different types.
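The iteration in Figure 5 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that KILLd(n) holds the definitions definitely killed at n and KILLp(n) those possibly killed at n, which is a reading of notation defined earlier in the paper. The function name, the dictionary-based CFG encoding, and the example graph are all hypothetical. Instead of saving the old OUT sets, the sketch compares each recomputed set against the stored one, which detects the same changes.

```python
def compute_reaching_defs(nodes, preds, gen, kill_d, kill_p):
    """Fixed-point iteration over the three OUT sets of Figure 5.

    nodes:  iterable of CFG node ids (visited in the given order)
    preds:  dict node -> iterable of predecessor nodes
    gen, kill_d, kill_p: dict node -> set of definition ids
    Returns (out_d, out_p, out_k), each a dict node -> set.
    """
    nodes = list(nodes)
    out_d = {n: set(gen.get(n, ())) for n in nodes}  # OUTd(n) = GEN(n)
    out_p = {n: set() for n in nodes}
    out_k = {n: set() for n in nodes}

    def merge(out, n):
        # IN(n) is the union of OUT(p) over all predecessors p of n.
        result = set()
        for p in preds.get(n, ()):
            result |= out[p]
        return result

    change = True
    while change:
        change = False
        for n in nodes:
            in_d, in_p, in_k = merge(out_d, n), merge(out_p, n), merge(out_k, n)
            kd = set(kill_d.get(n, ()))
            kp = set(kill_p.get(n, ()))
            new_d = set(gen.get(n, ())) | (in_d - kd - kp)
            new_p = (in_p - kd) | (in_d & kp)
            new_k = in_k | ((in_d | in_p) & kd)
            if new_d != out_d[n] or new_p != out_p[n] or new_k != out_k[n]:
                change = True
            out_d[n], out_p[n], out_k[n] = new_d, new_p, new_k
    return out_d, out_p, out_k


# Hypothetical diamond CFG: 1 -> 2 -> 4 and 1 -> 3 -> 4.
preds = {2: [1], 3: [1], 4: [2, 3]}
gen = {1: {"d1"}, 2: {"d2"}, 3: {"d3"}}
kill_d = {2: {"d1"}}  # node 2 definitely redefines the variable
kill_p = {3: {"d1"}}  # node 3 possibly redefines it, e.g. through a pointer
out_d, out_p, out_k = compute_reaching_defs([1, 2, 3, 4], preds, gen, kill_d, kill_p)
print(sorted(out_d[4]), sorted(out_p[4]), sorted(out_k[4]))
```

In this example, d2 and d3 reach node 4 without any intervening kill. Definition d1 appears in both OUTp(4) and OUTk(4), because it is possibly killed along the path through node 3 and definitely killed along the path through node 2.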
Data dependences, which relate statements that compute data values to statements that use those values, are useful for automating a variety of program-comprehension-related activities, such as reverse engineering, impact analysis, and debugging. Unfortunat
Table 5: Programs used for the empirical studies reported in the paper.
Subject: ansitape, diff, replace, tot, unzip
Description: Benchmark program; Compress/extract utility; Antenna-array specification parser; Statistical information combiner
LOC: 2906 5511530 1106
Figure 6: Distribution of data dependences using our classification and Ostrand and Weyuker's classification. (Vertical axis: percentage of total def-use associations, from 0 to 100. Horizontal axis: types 1, 3, 5, 6, 7, 9, 10, 13, 14, 15, 17, 19, 20, 22, and 23, followed by strong, weak, and very weak.)

3 In two of the subjects, two calls involving function pointers have been substituted with calls having the same data-flow effect.
Each bar in the figure corresponds to a data-dependence type and represents the percentage of data dependences of that type. The data in the figure illustrates that data dependences fall predominantly into only three types. Dua type 1, dua type 3, and dua type 20 occur most frequently: together these types account for over 98% of the total data dependences. Of the remaining 21 types of data dependences, 12 types occur in marginal numbers and account for the remaining 2% of the data dependences. The remaining 9 types of data dependences do not occur in the subjects; such types are not listed along the horizontal axis in Figure 6.
The results of this study are preliminary in nature. Although the data in Figure 6 shows trends in the distribution of data-dependence types, the scarcity of the data points prevents us from drawing any conclusions about the distribution. Further experimentation with more and diverse subjects will help determine if trends, such as the frequent occurrence of dua type 20, persist. The data in the figure shows that for over 30% of the data dependences, no path from the definition to the use contains a redefinition of the relevant variable. This result is important because it means that, for these data dependences, statement coverage guarantees the coverage of the corresponding def-use associations.
The last three bars in Figure 6 show the distribution of data-dependence types according to Ostrand and Weyuker's classification [19]. According to their classification, over 88% of the data dependences are strong, and over 11% of the data dependences are very weak. The remaining 1% of the data dependences are weak; no data dependence is firm.
4 Applications of the Data-Dependence Classification
The ability to classify data dependences can be exploited for different applications. For example, data dependences that are ordered based on their "strength" can guide a data-flow testing strategy [9], can be used to perform impact analyses focused on different kinds of dependences, and can be analyzed to identify parts of the code where possibly unforeseen data dependences require careful software inspections. In short, all activities that depend on data-dependence information can utilize such a classification. In this paper, we focus on an application that is related to program understanding: program slicing. In the following section, we define a slicing technique that lets us compute slices based on data-dependence types; we also illustrate a case study in which we apply the technique.
4.1 Program slicing
Traditional slicing techniques (e.g., [12, 14, 27]) include in the slice all statements that affect the slicing criterion through direct or transitive control and data dependences. Such techniques compute the slice by computing the transitive closure of all control and all data dependences starting at the slicing criterion. The classification of data dependences into different types leads to a new paradigm for slicing, in which the transitive closure is performed over only the specified types of data dependences, rather than over all data dependences. In this slicing paradigm, a slicing criterion is a triple <s, V, T>, where s is a program point, V is a set of program variables referenced at s, and T identifies one or more types of data dependences. The slice includes those statements that may affect the value of the variables in V at s through transitive control or specified types of data dependences.
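As an illustration of the criterion <s, V, T>, the following Python sketch computes a backward slice by a worklist traversal that follows all control dependences but only those data dependences whose type is in T. It is a simplified sketch under stated assumptions rather than the technique as implemented by the authors: the function name, the dictionary encoding of dependences, and the statement numbers are hypothetical, and the type labels 1 and 20 are used only as example tags.

```python
from collections import deque


def slice_by_types(s, variables, types, control_deps, data_deps):
    """Backward slice for a criterion <s, V, T>.

    control_deps: dict stmt -> set of stmts it is control dependent on
    data_deps:    dict use_stmt -> list of (def_stmt, variable, dep_type)
    Returns the set of statements in the slice, including s itself.
    """
    in_slice = {s}
    worklist = deque()

    def visit(stmt):
        # Add a statement to the slice once and schedule it for expansion.
        if stmt not in in_slice:
            in_slice.add(stmt)
            worklist.append(stmt)

    # Seed: at s, follow only dependences on variables in V with a type in T.
    for def_stmt, var, dep_type in data_deps.get(s, ()):
        if var in variables and dep_type in types:
            visit(def_stmt)
    for ctrl in control_deps.get(s, ()):
        visit(ctrl)

    # Transitive closure: all control dependences, data dependences in T only.
    while worklist:
        stmt = worklist.popleft()
        for ctrl in control_deps.get(stmt, ()):
            visit(ctrl)
        for def_stmt, _var, dep_type in data_deps.get(stmt, ()):
            if dep_type in types:
                visit(def_stmt)
    return in_slice


# Hypothetical dependence graph for statements 0..5.
data_deps = {
    5: [(3, "x", 1), (4, "x", 20)],
    3: [(1, "a", 1)],
    4: [(2, "b", 1)],
}
control_deps = {3: {0}}

narrow = slice_by_types(5, {"x"}, {1}, control_deps, data_deps)
wide = slice_by_types(5, {"x"}, {1, 20}, control_deps, data_deps)
print(sorted(narrow), sorted(wide))
```

With T restricted to type 1, statement 4 and its predecessor 2 are left out of the slice; adding type 20 to T brings them in.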
Using the new slicing paradigm, we define a slicing technique that increases the scope of a slice incrementally by including data dependences of different types. The technique starts by considering the stronger