The Confounding Effect of Class Size on The Validity of Obje(12)

2021-04-05 08:35


to fault-proneness. The path (b) depicts a positive causal relationship between size and fault-proneness.

The path (c) depicts a positive association between product metrics and size.

If this path diagram is concordant with reality, then size distorts the relationship between product metrics

and fault-proneness. Confounding can result in considerable bias in the estimate of the magnitude of the

association. Size is a positive confounder, which means that ignoring size will always make the

association between, say, coupling and fault-proneness appear more positive than it really is.

The potential confounding effect of size can be demonstrated through an example (adapted from [12]).

Consider the data in Table 1, which gave an odds ratio of 22.9. As mentioned earlier, this is representative

of the current univariate analyses used in the object-oriented product metrics validation literature (which

neither explicitly include size as a covariate nor employ a stratification on size).

Now, let us say that if we analyze the data separately for small and large classes, we obtain the data in Table 2 for the large classes and the data in Table 3 for the small classes (see footnote 15).

                        Coupling
Fault Proneness       HC      LC
Faulty                90      10
Not Faulty             9       1

Table 2: A contingency table showing the results for only large classes of a hypothetical validation study.

                        Coupling
Fault Proneness       HC      LC
Faulty                 1       9
Not Faulty            10      90

Table 3: A contingency table showing the results for only small classes of a hypothetical validation study.

In both of the above tables the odds ratio is one. By stratifying on size (i.e., controlling for the effect of

size), the association between coupling and fault-proneness has been reduced dramatically. This is

because size was the reason why there was an association between coupling and fault-proneness in the

first place. Once the influence of size is removed, the example shows that the impact of the coupling

metric disappears.
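
The arithmetic behind this example can be checked with a short sketch. It assumes that Table 1 is the cell-wise sum of Tables 2 and 3 (this reproduces the reported odds ratio of 22.9) and that each table is laid out with rows Faulty / Not Faulty and columns HC / LC. The Mantel-Haenszel estimate is included only as one common way of summarizing a size-adjusted odds ratio; it is not presented here as the method used in the study, and the function names are illustrative.

```python
from fractions import Fraction


def odds_ratio(table):
    """Odds ratio of a 2x2 table given as ((a, b), (c, d)).

    Rows are Faulty / Not Faulty, columns are HC / LC (assumed layout).
    """
    (a, b), (c, d) = table
    return Fraction(a * d, b * c)


def mantel_haenszel(strata):
    """Mantel-Haenszel odds ratio pooled across a list of 2x2 strata."""
    num = sum(Fraction(a * d, a + b + c + d) for (a, b), (c, d) in strata)
    den = sum(Fraction(b * c, a + b + c + d) for (a, b), (c, d) in strata)
    return num / den


large = ((90, 10), (9, 1))    # Table 2: large classes
small = ((1, 9), (10, 90))    # Table 3: small classes

# Ignoring size means adding the two strata cell by cell (assumed to be Table 1).
pooled = tuple(
    tuple(x + y for x, y in zip(row_l, row_s))
    for row_l, row_s in zip(large, small)
)

print("unstratified table:", pooled)
print("unstratified odds ratio:", round(float(odds_ratio(pooled)), 1))
print("large classes only:", odds_ratio(large))
print("small classes only:", odds_ratio(small))
print("size-adjusted (Mantel-Haenszel):", mantel_haenszel([large, small]))
```

Within each size stratum the odds ratio is exactly one, while the combined table yields roughly 22.9, which is the pattern described above.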

Therefore, an important improvement in the conduct of validation studies of object-oriented metrics is to

control for the effect of size; otherwise one may get the illusion that the product metric is strongly

associated with fault-proneness, when in reality the association is much weaker or non-existent.

2.2.3 Evidence of a Confounding Effect

Now we must consider whether the path diagram in Figure 2 can be supported in reality.

There is evidence that object-oriented product metrics are associated with size. For example, in [22] the

Spearman rho correlation coefficients go as high as 0.43 for associations of some coupling and

cohesion metrics with size, and 0.397 for inheritance metrics; both are statistically significant (at an

alpha level of, say, 0.1). Similar patterns emerge in the study reported in [19], where relatively large

correlations are shown. In another study [27] the authors display the correlation matrix showing the

Spearman correlation between a set of object-oriented metrics that can be collected from Shlaer-Mellor

designs and C++ LOC. The correlations range from 0.563 to 0.968, all statistically significant at an alpha

level of 0.05. This also indicates very strong correlations with size.
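
For readers who want to run this kind of check on their own data, the following is a minimal sketch of a Spearman rank correlation between a product metric and class size. The `coupling` and `loc` values are hypothetical and are not taken from any of the cited studies; the helper names are illustrative, and significance testing is omitted.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-class measurements (illustration only).
coupling = [2, 5, 1, 8, 4, 7]
loc = [120, 300, 80, 950, 310, 400]

print(round(spearman_rho(coupling, loc), 3))
```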

Footnote 15: Note that in this example the odds ratio of the size to fault-proneness association is 100, and that of the size to coupling association is

81 (99 × 99 / (11 × 11), from the column totals of Tables 2 and 3). Therefore, it follows the model in Figure 2.
