The Confounding Effect of Class Size on The Validity of Obje(16)

2021-04-05 08:35


In constructing our models, we could follow the previous literature and consider neither interaction effects nor transformations (for example, see [4][8][17][18][19][22][106]). To err on the conservative side, however, we did test for interaction effects between the size metric and the product metric for all product metrics evaluated. In none of the cases was a significant interaction effect identified. Furthermore, we performed a logarithmic transformation¹⁹ on our variables and re-evaluated all the models.²⁰ Our conclusions would not be affected by using the transformed models. Therefore, we only present the detailed results for the untransformed model.
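The interaction test described above can be written out as a sketch. The notation here is assumed for illustration, since Eqn. 2 and Eqn. 3 are not reproduced in this section: π is the probability that a class is faulty, x1 is the product metric, x2 is the size metric, and β0 to β3 are the model coefficients.

```latex
\operatorname{logit}(\pi) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2,
\qquad H_0 : \beta_3 = 0
```

Under this reading, failing to reject H0 for every product metric corresponds to the statement that no significant interaction effect was identified.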

The magnitude of an association can be expressed in terms of the change in odds ratio as the x1 variable changes by one standard deviation. This is explained in the appendix (Section 7), and is denoted by Ψ. Since we construct two models, as shown in Eqn. 2 and Eqn. 3, without and with controlling for size respectively, we denote the change in odds ratio as Ψx1 and Ψx1+x2 respectively. As suggested in [74], we can evaluate the extent to which the change in odds ratio changes as an indication of the extent of confounding. We operationalize this as follows:

$$\Delta\Psi = \frac{\Psi_{x_1} - \Psi_{x_1+x_2}}{\Psi_{x_1+x_2}} \times 100 \qquad \text{(Eqn. 4)}$$

This gives the percent change in Ψx1+x2 by removing the size confounder. If this value is large, then we can consider that class size does indeed have a confounding effect. The definition of “large” can be problematic; however, as will be seen in the results, the changes in our study are sufficiently big that, by any reasonable threshold, there is little doubt.
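As a minimal sketch of this calculation, the snippet below reads Eqn. 4 as (Ψx1 − Ψx1+x2) / Ψx1+x2 × 100. It also assumes the usual form of the change in odds ratio for a one-standard-deviation increase, exp(β·σ); the appendix that defines Ψ is not part of this section, so that form is an assumption. The function names and all numeric values are hypothetical and are not results from the study.

```python
import math


def odds_ratio_change(beta: float, std_dev: float) -> float:
    """Change in odds ratio for a one-SD increase (assumed form: exp(beta * sd))."""
    return math.exp(beta * std_dev)


def percent_change(psi_x1: float, psi_x1_x2: float) -> float:
    """Eqn. 4 as read above: percent change relative to the size-adjusted value."""
    return (psi_x1 - psi_x1_x2) / psi_x1_x2 * 100.0


# Hypothetical coefficients for one product metric (not values from the study).
sd_metric = 10.0                                 # standard deviation of x1
psi_x1 = odds_ratio_change(0.08, sd_metric)      # model without size (Eqn. 2)
psi_x1_x2 = odds_ratio_change(0.01, sd_metric)   # model controlling for size (Eqn. 3)

print(f"{percent_change(psi_x1, psi_x1_x2):.1f}% change")
```

A large positive value here would mean that the apparent effect of the metric shrinks substantially once size is controlled for.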

3.3.3 Diagnostics and Hypothesis Testing

The appendix of this paper presents the details of the model diagnostics that were performed and the approach to hypothesis testing. Here we summarize these.

The diagnostics concerned checking for collinearity and identifying influential observations. We compute the condition number specific to logistic regression, ηLR, to determine whether dependencies amongst the independent variables are affecting the stability of the model (collinearity). The Δβ value gives us an indication of which observations are overly influential. For hypothesis testing, we use the likelihood ratio statistic, G, to test the significance of the overall model, the Wald statistic to test the significance of individual model parameters, and the Hosmer and Lemeshow R² value as a measure of goodness of fit. Note that for the univariate model the G statistic and the Wald test are statistically equivalent, but we present them both for completeness. All statistical tests were performed at an alpha level of 0.05.
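The test statistics summarized above can be sketched from the quantities a fitted logistic regression normally reports. The sketch assumes the common definitions G = 2(LL_model − LL_null) and R² = 1 − LL_model / LL_null, where LL denotes a maximized log-likelihood; the appendix with the exact definitions is not part of this section. The p-value for G uses one degree of freedom, which fits the univariate model only. Function names and numeric values are hypothetical.

```python
import math


def likelihood_ratio_g(ll_null: float, ll_model: float) -> float:
    """G = 2 * (LL_model - LL_null), from the two maximized log-likelihoods."""
    return 2.0 * (ll_model - ll_null)


def hosmer_lemeshow_r2(ll_null: float, ll_model: float) -> float:
    """R^2 = 1 - LL_model / LL_null (assumed definition, see the note above)."""
    return 1.0 - ll_model / ll_null


def chi2_1df_p_value(statistic: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


def wald_p_value(beta: float, std_err: float) -> float:
    """Two-sided p-value for the Wald z = beta / SE under a standard normal."""
    return math.erfc(abs(beta / std_err) / math.sqrt(2.0))


ALPHA = 0.05  # alpha level stated in the text

# Hypothetical fit of a univariate model (not values from the study).
ll_null, ll_model = -100.0, -90.0
beta, std_err = 0.05, 0.012

g = likelihood_ratio_g(ll_null, ll_model)
print("G =", g, "significant:", chi2_1df_p_value(g) < ALPHA)
print("Wald significant:", wald_p_value(beta, std_err) < ALPHA)
print("R^2 =", round(hosmer_lemeshow_r2(ll_null, ll_model), 3))
```

For the model that also includes size, G would be compared against a chi-square distribution with two degrees of freedom instead.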

4 Results

4.1 Descriptive Statistics

Box and whisker plots for all the product metrics that we collected are shown in Figure 4. These indicate the median and the 25th and 75th quantiles. Outliers and extreme points are also shown in the figure.²¹

As is typical with product metrics, their distributions are clearly heavy tailed. Most of the variables are counts, and therefore their minimal value is zero. Variables NOC, NMO, and SIX have fewer than six observations that are non-zero. Therefore, they were excluded from further analysis. This is the approach followed in [22].

19 Given that product metrics are counts, an appropriate transformation to stabilize the variance would be the logarithm.

20 We wish to thank an anonymous reviewer for making this suggestion.

21 As will be noted, in some cases the minimal value is zero. For metrics such as CBO, WMC and RFC, this would be because the class was defined in a manner similar to a C struct, with no methods associated with it.
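The descriptive screening in Section 4.1 can be sketched as follows. The snippet computes the three values a box plot marks (25th quantile, median, 75th quantile) and drops metrics with fewer than six non-zero observations. The quantile interpolation method, the function names, and the sample data are assumptions for illustration and are not taken from the study.

```python
import statistics


def box_summary(values):
    """Return (25th quantile, median, 75th quantile) of a list of metric values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, median, q3


def metrics_to_keep(data, min_nonzero=6):
    """Keep metrics that have at least `min_nonzero` non-zero observations."""
    return [
        name
        for name, values in data.items()
        if sum(1 for v in values if v != 0) >= min_nonzero
    ]


# Hypothetical per-class metric values (not data from the study).
data = {
    "WMC": [0, 3, 5, 8, 12, 20, 41],
    "NOC": [0, 0, 0, 0, 1, 0, 2],
}

print(box_summary(data["WMC"]))
print(metrics_to_keep(data))
```

With this sample, NOC has only two non-zero observations and is dropped, mirroring the exclusion rule applied to NOC, NMO, and SIX.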

