The Confounding Effect of Class Size on The Validity of Obje(10)

2021-04-05 08:35


associated with fault-proneness. The NMO and NMA metrics were found to be associated with fault-

proneness, but the evidence for the SIX metric is more equivocal. The LCOM cohesion metric also has

equivocal evidence supporting its validity.

It should be noted that the differences in the results obtained across studies may be a consequence of

the measurement of different dependent variables. For instance, some treat the dependent variable as

the (continuous) number of defects found. Other studies use a binary value of incidence of a fault during

testing or in the field, or both. It is plausible that the effects of product metrics may be different for each of

these.

An optimistic observer would conclude that the evidence as to the predictive validity of most of these

metrics is good enough to recommend their practical usage.

2.2 The Confounding Effect of Size

In this section we take as a starting point the stance of an optimistic observer and assume that there is

sufficient empirical evidence demonstrating the relationship between the object-oriented metrics that we

study and fault-proneness. We already showed that previous empirical studies drew their conclusions

from univariate analyses. Below we make the argument that univariate analyses ignore the potential

confounding effects of class size. We show that if there is indeed a size confounding effect, then

previous empirical studies could have harbored a large positive bias.
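
As a rough numerical illustration of this argument, the sketch below uses invented counts (they are not data from any of the cited studies) in which coupling has no association with faults within either small or large classes, yet large classes are both more often highly coupled and more often faulty. The class name `CouplingFaultTable` and all numbers are assumptions made for the example; the odds ratio is used here simply as a familiar measure of association for a 2x2 table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CouplingFaultTable:
    """2x2 counts of classes by coupling level (HC/LC) and fault status."""
    hc_faulty: int
    hc_clean: int
    lc_faulty: int
    lc_clean: int

    def odds_ratio(self) -> float:
        # Odds of a fault among HC classes divided by the odds among LC classes.
        return (self.hc_faulty * self.lc_clean) / (self.hc_clean * self.lc_faulty)

    def __add__(self, other: "CouplingFaultTable") -> "CouplingFaultTable":
        # Pooling the strata mimics an analysis that ignores class size.
        return CouplingFaultTable(
            self.hc_faulty + other.hc_faulty,
            self.hc_clean + other.hc_clean,
            self.lc_faulty + other.lc_faulty,
            self.lc_clean + other.lc_clean,
        )


# Hypothetical counts, invented for illustration only.
# Small classes: 10% fault rate regardless of coupling, mostly LC.
small_classes = CouplingFaultTable(hc_faulty=2, hc_clean=18, lc_faulty=8, lc_clean=72)
# Large classes: 50% fault rate regardless of coupling, mostly HC.
large_classes = CouplingFaultTable(hc_faulty=40, hc_clean=40, lc_faulty=10, lc_clean=10)
pooled = small_classes + large_classes

print(f"small classes OR: {small_classes.odds_ratio():.2f}")
print(f"large classes OR: {large_classes.odds_ratio():.2f}")
print(f"pooled (size ignored) OR: {pooled.odds_ratio():.2f}")
```

With these counts the odds ratio is 1 within each size stratum but about 3.3 once the strata are pooled, which is the kind of positive bias a univariate analysis could carry if size is a confounder.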

For ease of presentation we take as a running example a coupling metric as the main metric that we are

trying to validate. For our purposes, a validation study is designed to determine whether there is an

association between coupling and fault-proneness. Furthermore, we assume that this coupling metric is

appropriately dichotomized: Low Coupling (LC) and High Coupling (HC). This dichotomization

assumption simplifies the presentation, but the conclusions can be directly generalized to a continuous

metric.
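
The text does not say how the coupling metric is dichotomized. One simple possibility, assumed here purely for illustration, is a median split; the function name `dichotomize` and the sample values are hypothetical.

```python
from statistics import median


def dichotomize(values, threshold=None):
    """Label each coupling value 'HC' if it exceeds the cut point, else 'LC'.

    The median split is an assumption of this sketch, not the source's rule.
    """
    cut = median(values) if threshold is None else threshold
    return ["HC" if value > cut else "LC" for value in values]


# Hypothetical coupling counts for eight classes.
coupling = [0, 1, 2, 3, 5, 8, 13, 21]
labels = dichotomize(coupling)
print(list(zip(coupling, labels)))
```

Any other cut point can be supplied through `threshold`; the argument in the text does not depend on which one is used.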

2.2.1 The Case Control Analogy

An object-oriented metrics validation study can be easily seen as an unmatched case-control study.

Case-control studies are frequently used in epidemiology to, for example, study the effect of exposure to carcinogens on the incidence of cancers [95][12].¹² The reason for using case-control studies as opposed

to randomized experiments in certain instances is that it would not be ethically and legally defensible to

do otherwise. For example, it would not be possible to have deliberately composed ‘exposed’ and

‘unexposed’ groups in a randomized experiment when the exposure is a suspected carcinogen or toxic

substance. Randomized experiments are more appropriately used to evaluate treatments or preventative

measures [52].

In applying the conduct of a case-control study to the validation of an object-oriented product metric, one

would first proceed by identifying classes that have faults in them (the cases). Then, for the purpose of

comparison, another group of classes without faults in them is identified (the controls). We determine

the proportion of cases that have, say, High Coupling and the proportion with Low Coupling. Similarly, we

determine the proportion of controls with High Coupling, and the proportion with Low Coupling. If there is

an association of coupling with fault-proneness then the prevalence of High Coupling classes would be

higher in the cases than in the controls. Effectively then, a case-control study follows a paradigm that

proceeds from effect to cause, attempting to find antecedents that lead to faults [99]. In a case-control

study, the control group provides an estimate of the frequency of High Coupling that would be expected

among the classes that do not have faults in them.
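
The procedure described above can be sketched as follows. The records are hypothetical (one `(has_fault, coupling_level)` pair per class), and the function name `exposure_summary` is an assumption of this example. The odds ratio is added as the customary summary of a case-control table; it is not a quantity introduced in the text up to this point.

```python
from collections import Counter

# Hypothetical data: 40 faulty classes (cases) and 60 fault-free classes (controls).
classes = (
    [(True, "HC")] * 30
    + [(True, "LC")] * 10
    + [(False, "HC")] * 20
    + [(False, "LC")] * 40
)


def exposure_summary(records):
    """Compare the prevalence of High Coupling among cases and controls."""
    counts = Counter(records)
    cases_hc, cases_lc = counts[(True, "HC")], counts[(True, "LC")]
    controls_hc, controls_lc = counts[(False, "HC")], counts[(False, "LC")]
    return {
        # Proportion of faulty classes that are highly coupled.
        "hc_prevalence_cases": cases_hc / (cases_hc + cases_lc),
        # Proportion of fault-free classes that are highly coupled.
        "hc_prevalence_controls": controls_hc / (controls_hc + controls_lc),
        # Odds of High Coupling among cases relative to controls.
        "odds_ratio": (cases_hc * controls_lc) / (cases_lc * controls_hc),
    }


summary = exposure_summary(classes)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this invented sample, High Coupling is more prevalent among the cases (0.75) than among the controls (about 0.33), which is the pattern one would expect if coupling were associated with fault-proneness.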

In an epidemiological context, it is common to have ‘hospital-based cases’ [52][95]. For example, a

subset or all patients that have been admitted to a hospital with a particular disease can be considered as cases.¹³ Controls can also be selected from the same hospital or clinic. The selection of controls is not

necessarily a simple affair. For example, one can match the cases with controls on some confounding

¹² Other types of studies that are used are cohort-studies [52], but we will not consider these here.

¹³ This raises the issue of generalizability of the results. However, as noted by Breslow and Day [12], generalization from the sample in a case-control study depends on non-statistical arguments. The concern with the design of the study is to maximize internal validity. In general, replication of results establishes generalizability [79].
