Computation of Effect Sizes

Statistical significance means that a result is unlikely to be due to random variation within the data. But not every significant result refers to an effect with a high impact; it may even describe a phenomenon that is hardly perceivable in everyday life. Statistical significance mainly depends on the sample size, the quality of the data and the power of the statistical procedures. If large data sets are at hand, as is often the case in epidemiological studies or large-scale assessments, for example, very small effects may reach statistical significance. To describe whether effects have a relevant magnitude, effect sizes are used to quantify the strength of a phenomenon. The most popular effect size measure is probably Cohen's d (Cohen, 1988).

Here you will find a number of online calculators for the computation of different effect sizes and an interpretation table at the bottom of this page:

  1. Comparison of groups with equal size
  2. Comparison of groups with different sample size
  3. Effect size for pre-post-intervention studies with the correction of pretest differences
  4. Calculation of d and r from the test statistics of dependent t-tests
  5. Computation of d from the F-value of Analyses of Variance (ANOVA)
  6. Calculation of effect sizes from ANOVAs with multiple groups, based on group means
  7. Increase of success through intervention: The Binomial Effect Size Display (BESD)
  8. Risk Ratio, Odds Ratio and Risk Difference
  9. Effect size for the difference between two correlations
  10. Computation of the pooled standard deviation
  11. Transformation of the effect sizes r, d, f, Odds Ratio and eta square
  12. Computation of the effect sizes d, r and η² from χ² and z test statistics
  13. Table for interpreting the magnitude of d, r and eta square according to Hattie (2009) and Cohen (1988)

1. Comparison of groups with equal size (Cohen's d)

If the two groups have the same n, the effect size is calculated by subtracting the means and dividing the result by the pooled standard deviation. The resulting effect size is called dCohen, and it represents the difference between the groups in units of their common standard deviation. It is used, for example, for calculating the effect for pre-post comparisons in single groups.

Group 1 Group 2
Mean
Standard Deviation
Effect Size dCohen
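
As a minimal sketch of the calculation described above, the following Python function computes dCohen for two groups of equal size. The function name and the sign convention (group 1 minus group 2) are assumptions made for illustration and need not match the calculator.

```python
import math

def cohens_d_equal_n(mean1, mean2, sd1, sd2):
    """Cohen's d for two groups with the same n (hypothetical helper)."""
    # With equal n, the pooled SD is the square root of the mean of the two variances.
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    # Assumed sign convention: group 1 minus group 2.
    return (mean1 - mean2) / pooled_sd

print(cohens_d_equal_n(10.0, 8.0, 2.0, 2.0))  # prints 1.0
```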


2. Comparison of groups with different sample size (Cohen's d)

Analogously, the effect size can be computed for groups with different sample sizes by weighting the two standard deviations by the sample sizes when calculating the pooled standard deviation.

Group 1 Group 2
Mean
Standard Deviation
Sample Size (N)
Effect Size dCohen
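
The weighting can be sketched as follows. This example assumes the common convention of weighting each variance by its degrees of freedom (n - 1); the calculator may weight differently, and the function names are hypothetical.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Sample-size weighted pooled standard deviation (assumed n - 1 weighting)."""
    weighted_variances = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    return math.sqrt(weighted_variances / (n1 + n2 - 2))

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for two groups of different size (group 1 minus group 2)."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)

print(round(cohens_d(10.0, 2.0, 20, 8.0, 3.0, 30), 3))
```

With equal sample sizes, this pooled standard deviation reduces to the square root of the mean of the two variances.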


3. Effect size for mean differences of groups with unequal sample size within a pre-post design
   (dcorr sensu Klauer, 2001)

Intervention studies usually compare at least one intervention group and one control group, with measurements taken before and after the intervention. The following calculator follows the suggestions of Klauer (2001), who controlled for different sample sizes and pretest differences. The downside of this approach is that the pre- and post-tests are treated as independent data rather than as repeated measures. For dependent tests, you can use calculator 4 or transform eta square from repeated measures in order to account for dependencies between measurement points.

Group 1 Group 2
Pre Post Pre Post
Mean
Standard Deviation
Sample Size (N)
Effect Size dcorr
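
One possible reading of this approach, sketched below under stated assumptions, is to compute a standardized mean difference between the groups at pre-test and at post-test, each with a sample-size weighted pooled standard deviation, and to subtract the pre-test value from the post-test value. The function names are hypothetical, group 1 is assumed to be the intervention group, and any small-sample correction the calculator may apply is left out, so results can differ from the calculator's output.

```python
import math

def standardized_difference(mean1, sd1, n1, mean2, sd2, n2):
    """Between-group mean difference in units of the weighted pooled SD."""
    pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled

def d_corr(pre1, post1, pre2, post2):
    """Post-test effect minus pre-test effect.

    Each argument is a (mean, sd, n) tuple; group 1 is assumed to be the
    intervention group and group 2 the control group.
    """
    d_pre = standardized_difference(*pre1, *pre2)
    d_post = standardized_difference(*post1, *post2)
    return d_post - d_pre

# Both groups start at the same level; only group 1 improves.
print(d_corr((50, 10, 30), (55, 10, 30), (50, 10, 30), (50, 10, 30)))  # prints 0.5
```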


4. Calculation of d and r from the test statistics of dependent t-tests

Dependent testing usually yields higher power, because the connection between data points of different measurements is retained. This may be relevant, for example, when testing the same persons repeatedly or when analyzing test results from matched persons or twins. Accordingly, more information can be used when computing effect sizes. The following online calculator uses the test statistic of a dependent t test and the degrees of freedom.

t Value
Degrees of Freedom
Effect Size d
Effect Size r
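
A hedged sketch of such a conversion is given below. It assumes two widely used conventions: d as the standardized mean of the difference scores (often written d_z = t / sqrt(n), with n = df + 1 pairs) and r = t / sqrt(t² + df). The calculator may rely on a different definition of d for dependent samples, so the function is illustrative only.

```python
import math

def effect_sizes_from_dependent_t(t, df):
    """Return (d_z, r) from a dependent t test (assumed conventions)."""
    n_pairs = df + 1                   # a dependent t test has df = n - 1
    d_z = t / math.sqrt(n_pairs)       # standardized mean of the difference scores
    r = t / math.sqrt(t ** 2 + df)     # keeps the sign of t
    return d_z, r

d_z, r = effect_sizes_from_dependent_t(2.5, 24)
print(round(d_z, 3), round(r, 3))
```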


5. Computation of d from the F-value of Analyses of Variance (ANOVA)

An effect size from analyses of variance (ANOVAs) that is very easy to interpret is η², which reflects the proportion of the total variance that is explained. This proportion can be transformed directly into d. If η² is not available, the F value of the ANOVA can be used as well, as long as the sample size is known. The following computation only works for ANOVAs with two distinct groups (df1 = 1; Thalheimer & Cook, 2002):

F-Value
Sample Size of the Treatment Group
Sample Size of the Control Group
Effect Size d
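
For two groups, F equals t², so d can be recovered from F and the two sample sizes. The sketch below assumes the conversion d = sqrt(F · (nt + nc) / (nt · nc) · (nt + nc) / (nt + nc - 2)), which is the form commonly attributed to Thalheimer & Cook (2002); the function name is hypothetical, and the sign of the effect has to be taken from the group means because F is always non-negative.

```python
import math

def d_from_f(f_value, n_treatment, n_control):
    """Absolute value of d from a two-group ANOVA F value (df1 = 1)."""
    n_total = n_treatment + n_control
    size_term = n_total / (n_treatment * n_control)
    df_term = n_total / (n_total - 2)
    return math.sqrt(f_value * size_term * df_term)

print(round(d_from_f(4.0, 20, 20), 3))
```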


6. Calculation of effect sizes from ANOVAs with multiple groups, based on group means

If the group means are known from ANOVAs with multiple groups, it is possible to compute the effect sizes f and d (Cohen, 1988, pp. 273 ff.). Before computing the effect size, you have to determine the minimum and maximum mean and calculate the deviation of means manually: (a) compute the differences between the single means, (b) square the differences and sum them up, (c) divide the sum by the number of means, (d) take the square root.

Additionally, you have to decide which scenario fits the data best:

  1. Please choose 'minimum deviation', if the group means are distributed close to the total mean.
  2. Please choose 'intermediate deviation', if the means are evenly distributed.
  3. Please choose 'maximum deviation', if the means are distributed mainly towards the extremes and not in the center of the range of means.

Highest Mean (mmax)
Lowest Mean (mmin)
Deviation of Means
Number of Groups
Distribution of Means
Effect Size f
Effect Size d
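
Cohen (1988) relates f to the standardized range of means, d = (mmax - mmin) / σ, with one conversion per scenario. The sketch below implements those conversions as an illustration. It assumes that the standard deviation passed in is the common within-group standard deviation; the function name and the pattern labels are hypothetical, and the way the calculator combines its input fields may differ.

```python
import math

def f_from_range_of_means(m_max, m_min, sd_within, k, pattern):
    """Return (f, d) for k groups under one of Cohen's three patterns."""
    d = (m_max - m_min) / sd_within
    if pattern == "minimum":
        # One mean at each end, all remaining means at the midpoint.
        f = d * math.sqrt(1 / (2 * k))
    elif pattern == "intermediate":
        # Means spaced evenly across the range.
        f = (d / 2) * math.sqrt((k + 1) / (3 * (k - 1)))
    elif pattern == "maximum":
        # All means located at the two extremes of the range.
        if k % 2 == 0:
            f = d / 2
        else:
            f = d * math.sqrt(k ** 2 - 1) / (2 * k)
    else:
        raise ValueError("pattern must be 'minimum', 'intermediate' or 'maximum'")
    return f, d

for name in ("minimum", "intermediate", "maximum"):
    f, d = f_from_range_of_means(12.0, 10.0, 2.0, 4, name)
    print(name, round(f, 3), round(d, 3))
```

For two groups, all three patterns give f = d / 2, which matches the transformation between d and f for two equally sized groups.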


7. Increase of intervention success: The Binomial Effect Size Display (BESD)

Measures of effect size like d or correlations can be hard to communicate, e.g. to patients. If you use r², for example, effects seem to be really small, and when a person does not know or understand the interpretation guidelines, even effective interventions could be seen as futile. Yet even small effects can be very important, as Hattie (2007) underlines.

Rosenthal and Rubin (1982) suggest another way of looking at the effects of treatments: considering the increase in success through interventions. The approach is suitable for 2x2 contingency tables with the treatment groups in the rows and the numbers of cases per outcome in the columns. The BESD is computed by subtracting the probability of success in the control group from the probability of success in the intervention group. The resulting percentage can be transformed into dCohen.

Please fill in the number of cases with a fortunate and unfortunate outcome in the different cells:

Success Failure Probability of Success
Intervention group
Control Group
Binomial Effect Size Display (BESD)
(Increase of Intervention Success)
rPhi
Effect Size dcohen
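
The computation on a 2x2 table can be sketched as follows. The example assumes that rPhi is the phi coefficient of the table and that d is obtained from it via d = 2r / sqrt(1 - r²); both choices and the function name are illustrative assumptions and may not mirror the calculator exactly.

```python
import math

def besd(success_int, failure_int, success_ctrl, failure_ctrl):
    """Return (increase of success, r_phi, d) for a 2x2 table."""
    n_int = success_int + failure_int
    n_ctrl = success_ctrl + failure_ctrl
    # Difference between the success rates of the two groups.
    increase = success_int / n_int - success_ctrl / n_ctrl
    # Phi coefficient of the 2x2 table.
    numerator = success_int * failure_ctrl - failure_int * success_ctrl
    denominator = math.sqrt(
        n_int * n_ctrl
        * (success_int + success_ctrl)
        * (failure_int + failure_ctrl)
    )
    r_phi = numerator / denominator
    d = 2 * r_phi / math.sqrt(1 - r_phi ** 2)
    return increase, r_phi, d

increase, r_phi, d = besd(60, 40, 40, 60)
print(round(increase, 3), round(r_phi, 3), round(d, 3))
```

When both groups have the same size and successes and failures are equally frequent overall, as in this example, the difference in success rates equals rPhi.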


8. Risk Ratio, Odds Ratio and Risk Difference

Studies that investigate whether specific incidences occur (e.g. death, healing, academic success ...) on a binary basis (yes versus no), and whether two groups differ with respect to these incidences, usually use Odds Ratios, Risk Ratios and Risk Differences to quantify the differences between the groups (Borenstein et al., 2009, chap. 5). These forms of effect size are therefore commonly used in clinical research.

When doing meta-analytic research, please use LogRiskRatio or LogOddsRatio when aggregating data and transform the aggregated result back (exponentiate it) at the end.

Incidence No Incidence N
Treatment
Control

Risk Ratio Odds Ratio Risk Difference
Result
Log
Estimated Variance V: VLogRiskRatio VLogOddsRatio VRiskDifference
Estimated Standard Error SE: SELogRiskRatio SELogOddsRatio SERiskDifference
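
The three measures and their variances can be sketched with the standard formulas for a 2x2 table as described in Borenstein et al. (2009, chap. 5). The function name and the dictionary keys are hypothetical. No continuity correction is applied, so the sketch fails with a division by zero when a cell is empty.

```python
import math

def binary_effect_sizes(events_t, non_events_t, events_c, non_events_c):
    """Risk ratio, odds ratio and risk difference with variances and SEs."""
    a, b, c, d = events_t, non_events_t, events_c, non_events_c
    n1, n2 = a + b, c + d
    risk_ratio = (a / n1) / (c / n2)
    odds_ratio = (a * d) / (b * c)
    risk_difference = a / n1 - c / n2
    # Variances of the log-transformed ratios and of the raw risk difference.
    var_log_rr = 1 / a - 1 / n1 + 1 / c - 1 / n2
    var_log_or = 1 / a + 1 / b + 1 / c + 1 / d
    var_rd = a * b / n1 ** 3 + c * d / n2 ** 3
    return {
        "risk_ratio": risk_ratio,
        "log_risk_ratio": math.log(risk_ratio),
        "se_log_risk_ratio": math.sqrt(var_log_rr),
        "odds_ratio": odds_ratio,
        "log_odds_ratio": math.log(odds_ratio),
        "se_log_odds_ratio": math.sqrt(var_log_or),
        "risk_difference": risk_difference,
        "se_risk_difference": math.sqrt(var_rd),
    }

# 5 of 100 treated cases and 10 of 100 control cases show the incidence.
for key, value in binary_effect_sizes(5, 95, 10, 90).items():
    print(key, round(value, 4))
```

The log values and their standard errors are the quantities to aggregate in a meta-analysis; the pooled log value is exponentiated afterwards to return to the ratio scale.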




9. Effect size for the difference between two correlations

Cohen (1988, p. 109) suggests an effect size measure called q for interpreting the difference between two correlations. The two correlations are transformed with Fisher's Z and then subtracted. Cohen proposes the following categories for the interpretation: < .1: no effect; .1 to .3: small effect; .3 to .5: intermediate effect; > .5: large effect.

Correlation r1
Correlation r2
Cohen's q
Interpretation
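
As a short sketch, Fisher's Z transformation is the inverse hyperbolic tangent, so q is the difference of two atanh values. The function names are hypothetical, and assigning a value that lies exactly on a boundary to the higher category is an assumption, since the categories above do not settle that case.

```python
import math

def cohens_q(r1, r2):
    """Difference between two Fisher Z-transformed correlations."""
    return math.atanh(r1) - math.atanh(r2)

def interpret_q(q):
    """Map |q| to the categories listed above (boundaries go to the higher category)."""
    size = abs(q)
    if size < 0.1:
        return "no effect"
    if size < 0.3:
        return "small effect"
    if size < 0.5:
        return "intermediate effect"
    return "large effect"

q = cohens_q(0.5, 0.2)
print(round(q, 3), interpret_q(q))
```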


10. Computation of the pooled standard deviation

In order to compute Cohen's d, it is necessary to determine the mean (pooled) standard deviation. Here you will find a small tool that does this for you. Different sample sizes are taken into account as well:

Group 1 Group 2
Standard Deviation
Sample size (N)
Pooled Standard Deviation spool


11. Transformation of the effect sizes d, r, f, Odds Ratio and η²

Please choose the effect size you want to transform in the drop-down menu, then specify its magnitude. The transformation is done according to Cohen (1988), Rosenthal (1994, p. 239) and Borenstein, Hedges, Higgins, and Rothstein (2009; transformation of d into Odds Ratios).

Effect size
d
r
η²
f
Odds Ratio
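
The following sketch collects commonly used conversion formulas with d as the common hub. It assumes two groups of equal size (which gives the constant 4 in the conversion between d and r) and, for the odds ratio, the logistic approximation ln(OR) = d · π / sqrt(3) described by Borenstein et al. (2009). The function names are hypothetical, and the calculator may differ in rounding or in the formulas it applies.

```python
import math

def d_to_r(d):
    return d / math.sqrt(d ** 2 + 4)

def r_to_d(r):
    return 2 * r / math.sqrt(1 - r ** 2)

def d_to_eta_squared(d):
    # For two groups, eta squared equals r squared.
    return d ** 2 / (d ** 2 + 4)

def eta_squared_to_d(eta_squared):
    f = math.sqrt(eta_squared / (1 - eta_squared))
    return 2 * f

def d_to_f(d):
    return d / 2

def d_to_odds_ratio(d):
    return math.exp(d * math.pi / math.sqrt(3))

def odds_ratio_to_d(odds_ratio):
    return math.log(odds_ratio) * math.sqrt(3) / math.pi

d = 0.5
print(round(d_to_r(d), 3), round(d_to_eta_squared(d), 3),
      round(d_to_f(d), 3), round(d_to_odds_ratio(d), 3))
```

Routing every conversion through d keeps the number of formulas small: any pair of measures can be converted by going to d first and from there to the target measure.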


12. Computation of the effect sizes d, r and η² from χ² and z test statistics

The χ² and z test statistics from hypothesis tests can be used to compute d and r (Rosenthal & DiMatteo, 2001, p. 71; cf. Ellis, 2010, p. 28). The calculation is, however, only correct for χ² tests with one degree of freedom. Please choose the test statistic from the drop-down menu and specify the value and N. The transformation from d to r and η² is based on the formulas used in the previous section.

Test statistic
N
d
r
η²
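
A minimal sketch of these conversions, assuming r = sqrt(χ² / N) for a χ² test with one degree of freedom and r = z / sqrt(N) for a z statistic, followed by d = 2r / sqrt(1 - r²) and η² = r². The function names are hypothetical. Note that χ² carries no sign, so the direction of the effect has to be taken from the data.

```python
import math

def r_from_chi_square(chi_square, n):
    """r from a chi-square statistic with df = 1 (always non-negative)."""
    return math.sqrt(chi_square / n)

def r_from_z(z, n):
    """r from a z statistic (keeps the sign of z)."""
    return z / math.sqrt(n)

def d_and_eta_squared_from_r(r):
    d = 2 * r / math.sqrt(1 - r ** 2)
    return d, r ** 2

r = r_from_chi_square(4.0, 100)
d, eta_squared = d_and_eta_squared_from_r(r)
print(round(r, 3), round(d, 3), round(eta_squared, 3))
```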


13. Table of interpretation for different effect sizes

Here you can see the suggestions of Cohen (1988) and Hattie (2009, p. 97) for interpreting the magnitude of effect sizes. Hattie refers to real educational contexts and therefore uses a more lenient classification than Cohen. We slightly adjusted the intervals where the interpretation did not exactly match the categories of the original authors.

d       r*     η²     Interpretation sensu Cohen (1988)   Interpretation sensu Hattie (2007)
< 0     < 0    -      Adverse Effect
0.0     .00    .000   No Effect                           Developmental effects
0.1     .05    .003
0.2     .10    .010   Small Effect                        Teacher effects
0.3     .15    .022
0.4     .20    .039                                       Zone of desired effects
0.5     .24    .060   Intermediate Effect
0.6     .29    .083
0.7     .33    .110
0.8     .37    .140   Large Effect
0.9     .41    .168
≥ 1.0   .45    .200

* Cohen (1988) reports the following intervals for r: .1 to .3: small effect; .3 to .5: intermediate effect; .5 and higher: strong effect




Literature

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to Meta-Analysis, Chapter 7: Converting Among Effect Sizes. Chichester, West Sussex, UK: Wiley.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Ellis, P. (2010). The Essential Guide to Effect Sizes: Statistical Power, Meta-Analysis, and the Interpretation of Research Results. Cambridge: Cambridge University Press.

Hattie, J. (2009). Visible Learning. London: Routledge.

Klauer, K. J. (2001). Handbuch kognitives Training. Göttingen: Hogrefe.

Rosenthal, R. (1994). Parametric measures of effect size. In H. Cooper & L. V. Hedges (Eds.), The Handbook of Research Synthesis (pp. 231-244). New York, NY: Sage.

Rosenthal, R. & DiMatteo, M. R. (2001). Meta-Analysis: Recent Developments in Quantitative Methods for Literature Reviews. Annual Review of Psychology, 52(1), 59-82. doi:10.1146/annurev.psych.52.1.59

Thalheimer, W., & Cook, S. (2002, August). How to calculate effect sizes from published research articles: A simplified methodology. Retrieved March 9, 2014 from http://work-learning.com/effect_sizes.htm.