Effect Size Types and Their Normalization

| Effect Size Type | Abbreviation in Database | Convertible to r? |
|---|---|---|
| Cohen's d | d | Yes |
| Hedges' g | g | Yes |
| Odds Ratio | OR | Yes |
| Hazard Ratio | HR | Yes |
| Risk Ratio | RR | Yes (approximate) |
| Eta Squared | etasq | Yes |
| Partial Eta Squared | partial etasq | Yes (approximate) |
| Cohen's f | f | Yes |
| Cohen's f² | | Yes |
| R Squared | | Yes |
| Phi Coefficient | phi | Yes |
| Pearson Correlation | r | Yes (already r) |
| t-test | t | Yes |
| F-test | F | Yes |
| z-test | z | Yes |
| Chi-squared | χ² | Yes |
| Incidence Rate Difference | IRD | No |
| Glass' delta | Glass' delta | No |
| Cliff's delta | Cliff's delta | No |
| Cohen's w | w | No |
| Regression coefficient (standardized) | β | No |
| Regression coefficient (unstandardized) | b | No |
| Probability Difference | PD | No |
| Cohen's $d_z$ (paired) | dz | No |
| Log Ratio of Means (signed) | log ROM | No |
| Spearman's rank correlation | Spearman's r | No |

The Metascience Observatory's replications database contains a wide variety of reported effect size types. To make these types commensurable, we convert them into an equivalent or approximate Pearson correlation coefficient ($r$) when possible. This places effect magnitudes on a common 0 to 1 scale. Not all effect size types can be converted this way, but many can.

To consistently show reversals in effect direction as negatives, we always report the original effect as positive. Replication effect sizes are then coded with a sign reflecting whether they match the original direction (positive) or reverse it (negative).
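The sign convention can be illustrated with a small sketch. This is not the database's actual code: the function name `code_signs` is hypothetical, and it assumes both inputs are already expressed as $r$ in the same raw direction (for example, treatment minus control).

```python
def code_signs(original_r, replication_r):
    """Return (original, replication) coded so the original is non-negative.

    Hypothetical helper for illustration only.
    """
    if original_r < 0:
        # Flip both values so the original becomes positive. A replication
        # in the same raw direction then also becomes positive.
        return -original_r, -replication_r
    return original_r, replication_r


print(code_signs(-0.3, -0.1))  # same direction as original -> (0.3, 0.1)
print(code_signs(-0.3, 0.2))   # reversal -> (0.3, -0.2)
```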


Cohen's d

Cohen's $d$ gives a standardized measure of the difference between two groups' means (Cohen, 1988). It is defined as:

$$d = \frac{M_1 - M_2}{SD_{pooled}}$$

Where:

  • $M_1, M_2$: The means of the two groups.
  • $SD_{pooled}$: The pooled standard deviation of the two groups.

Normalization to 0–1 Scale (Conversion to $r$)

The standard conversion formula used is (Borenstein et al., 2009, p. 48):

$$r = \frac{d}{\sqrt{d^2 + \frac{(n_1 + n_2)^2}{n_1 n_2}}}$$

Note: If sample sizes are equal ($n_1 = n_2$), this simplifies to the commonly seen form $r = \frac{d}{\sqrt{d^2 + 4}}$ (Cohen, 1988), which is only an approximation when the group sizes differ.
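A minimal sketch of this conversion follows. The function name `d_to_r` is hypothetical, and falling back to the equal-groups form when sample sizes are missing is an assumption of this example, not a documented behavior of the database code.

```python
import math
from typing import Optional


def d_to_r(d: float, n1: Optional[int] = None, n2: Optional[int] = None) -> float:
    """Convert Cohen's d to r (Borenstein et al., 2009).

    Assumption: when group sizes are not supplied, use the equal-n term 4.
    """
    if n1 is not None and n2 is not None:
        a = (n1 + n2) ** 2 / (n1 * n2)  # correction term for group sizes
    else:
        a = 4.0
    return d / math.sqrt(d ** 2 + a)


print(round(d_to_r(0.5), 4))          # 0.2425 (equal groups)
print(round(d_to_r(0.5, 20, 80), 4))  # 0.1961 (unbalanced groups give a smaller r)
```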


Hedges' g

Hedges' $g$ is a bias-corrected version of Cohen's $d$ that adjusts for the slight upward bias of $d$ in small samples (Hedges, 1981). It is defined as:

$$g = J \cdot d$$

Where:

  • $d$: Cohen's $d$.
  • $J$: The correction factor, $J = 1 - \frac{3}{4 \cdot df - 1}$, where $df = n_1 + n_2 - 2$.

Normalization to 0–1 Scale (Conversion to $r$)

Because $g$ is on the same scale as $d$, the same conversion formula is used (Borenstein et al., 2009, p. 48):

$$r = \frac{g}{\sqrt{g^2 + \frac{(n_1 + n_2)^2}{n_1 n_2}}}$$

Note: If sample sizes are equal ($n_1 = n_2$), this simplifies to $r = \frac{g}{\sqrt{g^2 + 4}}$.
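As an illustration, the correction factor and the conversion can be sketched as below. The helper names (`hedges_j`, `d_to_g`, `g_to_r`) are hypothetical and the input values are made up for the example.

```python
import math


def hedges_j(n1: int, n2: int) -> float:
    """Small-sample correction factor J = 1 - 3 / (4*df - 1)."""
    df = n1 + n2 - 2
    return 1 - 3 / (4 * df - 1)


def d_to_g(d: float, n1: int, n2: int) -> float:
    """Apply the correction to Cohen's d to obtain Hedges' g."""
    return hedges_j(n1, n2) * d


def g_to_r(g: float, n1: int, n2: int) -> float:
    """Convert g to r with the same formula used for d."""
    a = (n1 + n2) ** 2 / (n1 * n2)
    return g / math.sqrt(g ** 2 + a)


g = d_to_g(0.5, 10, 10)
print(round(hedges_j(10, 10), 4))  # 0.9577
print(round(g, 4))                 # 0.4789
print(round(g_to_r(g, 10, 10), 3))  # 0.233
```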


Odds Ratio (OR)

The Odds Ratio measures the association between an exposure and an outcome, representing the odds that an outcome will occur given a particular exposure, compared to the odds of the outcome occurring in the absence of that exposure.

$$OR = \frac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)}$$

Where:

  • $p_1$: The probability of the event in the first group (e.g., treatment group).
  • $p_2$: The probability of the event in the second group (e.g., control group).

Normalization to 0–1 Scale (Conversion to $r$)

This is a two-step process where the Log Odds Ratio is first converted to Cohen's $d$, and then to $r$ (Chinn, 2000):

  1. Convert to $d$: $d = \frac{\ln(OR) \cdot \sqrt{3}}{\pi}$
  2. Convert to $r$: $r = \frac{d}{\sqrt{d^2 + 4}}$
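The two steps above can be sketched as follows. The function name `or_to_r` is hypothetical; the sketch only mirrors the formulas as written here.

```python
import math


def or_to_r(odds_ratio: float) -> float:
    """Convert an odds ratio to r via Cohen's d (Chinn, 2000)."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    d = math.log(odds_ratio) * math.sqrt(3) / math.pi  # step 1: ln(OR) -> d
    return d / math.sqrt(d ** 2 + 4)                   # step 2: d -> r


print(round(or_to_r(2.0), 3))  # 0.188
print(or_to_r(1.0))            # 0.0 (no association)
```

An odds ratio below 1 yields a negative $r$ of the same magnitude as its reciprocal, because the logarithm is symmetric around OR = 1.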

Hazard Ratio (HR)

The Hazard Ratio is a measure of effect size commonly used in survival analysis (e.g., Cox proportional hazards regression). It represents the ratio of the hazard rates between two groups over time.

$$HR = \frac{h_1(t)}{h_2(t)}$$

Where:

  • $h_1(t)$: The hazard rate in the first group (e.g., treatment group) at time $t$.
  • $h_2(t)$: The hazard rate in the second group (e.g., control group) at time $t$.

Normalization to 0–1 Scale (Conversion to $r$)

The Hazard Ratio is converted using the same formula as the Odds Ratio. This approximation is most accurate when the event rate is low (< 10–15%) or follow-up time is short, conditions under which HR ≈ OR (Chinn, 2000):

  1. Convert to $d$: $d = \frac{\ln(HR) \cdot \sqrt{3}}{\pi}$
  2. Convert to $r$: $r = \frac{d}{\sqrt{d^2 + 4}}$

Note: This conversion is an approximation. For common events or long follow-up periods, HR and OR can diverge, making the conversion less precise.


Risk Ratio (RR)

The Risk Ratio (also called Relative Risk) measures the ratio of the probability of an event occurring in an exposed group versus the probability in an unexposed group. It is commonly estimated from cohort studies or count-based regression models (e.g., Poisson or negative binomial regression).

$$RR = \frac{p_1}{p_2}$$

Where:

  • $p_1$: The probability (or rate) of the event in the first group (e.g., exposed group).
  • $p_2$: The probability (or rate) of the event in the second group (e.g., unexposed group).

Normalization to 0–1 Scale (Conversion to $r$)

The Risk Ratio is converted using the same log-based formula as the Odds Ratio and Hazard Ratio. This approximation is most accurate when event rates are low, a condition under which RR ≈ OR.

  1. Convert to $d$: $d = \frac{\ln(RR) \cdot \sqrt{3}}{\pi}$
  2. Convert to $r$: $r = \frac{d}{\sqrt{d^2 + 4}}$

Note: When event rates are high, RR and OR diverge (RR is always closer to 1.0 than OR for the same data), making the conversion less precise. For rare events (< 10%), RR ≈ OR and the approximation is good.
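To make the divergence concrete, the sketch below compares RR and OR for a rare and a common event with the same risk ratio. The function names and the event probabilities are hypothetical values chosen for illustration.

```python
import math


def log_ratio_to_r(ratio: float) -> float:
    """Apply the log-based conversion (ratio -> d -> r) described above."""
    d = math.log(ratio) * math.sqrt(3) / math.pi
    return d / math.sqrt(d ** 2 + 4)


def rr_and_or(p1: float, p2: float):
    """Return (risk ratio, odds ratio) for two event probabilities."""
    rr = p1 / p2
    odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))
    return rr, odds_ratio


# Both scenarios have RR = 2, but the odds ratios differ sharply.
for p1, p2 in [(0.02, 0.01), (0.80, 0.40)]:
    rr, odds_ratio = rr_and_or(p1, p2)
    print(
        f"p1={p1}, p2={p2}: RR={rr:.2f}, OR={odds_ratio:.2f}, "
        f"r from RR={log_ratio_to_r(rr):.3f}, r from OR={log_ratio_to_r(odds_ratio):.3f}"
    )
```

In the rare-event case the two ratios nearly coincide (OR is about 2.02), while in the common-event case the OR is 6, so the $r$ derived from RR understates what the OR-based conversion would give.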


Eta Squared ($\eta^2$)

Eta squared is a measure of effect size in analysis of variance (ANOVA) that represents the proportion of total variance in the dependent variable that is associated with the membership of different groups defined by an independent variable (Cohen, 1988).

$$\eta^2 = \frac{SS_{effect}}{SS_{total}}$$

Where:

  • $SS_{effect}$: The sum of squares for the effect (between-groups).
  • $SS_{total}$: The total sum of squares.

Normalization to 0–1 Scale (Conversion to $r$)

The conversion is a two-step process, first converting to Cohen's $d$, then to $r$ (Cohen, 1988):

  1. Convert to $d$: $d = 2\sqrt{\frac{\eta^2}{1 - \eta^2}}$
  2. Convert to $r$: $r = \frac{d}{\sqrt{d^2 + 4}}$

Note: This is algebraically equivalent to $r = \sqrt{\eta^2}$, but the code implements the two-step conversion.
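The equivalence can be checked numerically with a short sketch. The function name `eta_sq_to_r` is hypothetical and is not taken from the database code.

```python
import math


def eta_sq_to_r(eta_sq: float) -> float:
    """Two-step conversion: eta squared -> Cohen's d -> r."""
    if not 0 <= eta_sq < 1:
        raise ValueError("eta squared must be in [0, 1)")
    d = 2 * math.sqrt(eta_sq / (1 - eta_sq))
    return d / math.sqrt(d ** 2 + 4)


for eta_sq in (0.01, 0.09, 0.25):
    # The two-step result agrees with the direct square root.
    print(eta_sq, round(eta_sq_to_r(eta_sq), 6), round(math.sqrt(eta_sq), 6))
```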


Partial Eta Squared ($\eta^2_p$)

Partial eta squared is a variant of eta squared commonly reported by statistical software (e.g., SPSS) in factorial ANOVA designs. Unlike eta squared, which divides by the total sum of squares, partial eta squared divides only by the sum of squares for the effect plus the error sum of squares, excluding variance attributable to other factors in the design.

$$\eta^2_p = \frac{SS_{effect}}{SS_{effect} + SS_{error}}$$

Where:

  • $SS_{effect}$: The sum of squares for the effect of interest.
  • $SS_{error}$: The sum of squares for the error term.

Normalization to 0–1 Scale (Conversion to $r$)

The same conversion formula used for eta squared is applied:

  1. Convert to $d$: $d = 2\sqrt{\frac{\eta^2_p}{1 - \eta^2_p}}$
  2. Convert to $r$: $r = \frac{d}{\sqrt{d^2 + 4}}$

This is algebraically equivalent to $r = \sqrt{\eta^2_p}$.

Important Caveats: In a one-way design with a single 1-df effect (i.e., a two-group comparison, which covers most replication studies), partial eta squared equals eta squared and the conversion is exact. In multi-factor ANOVA designs, partial eta squared removes variance attributable to other factors from the denominator, so the resulting $r$ can be inflated compared to what a one-way design would yield. However, the Cambridge MRC Cognition and Brain Sciences Unit statistics wiki says that one can "convert a partial eta-squared to a Cohen's d by regarding the partial eta-squared as a squared correlation." At least for direct replication comparisons, where both the original and replication use the same design, this conversion is appropriate because any inflation applies equally to both studies, preserving the relative comparison.
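The inflation described above can be seen in a small numerical sketch. The sums of squares are made-up values for a hypothetical multi-factor ANOVA, and the function names are illustrative only.

```python
import math


def eta_squared(ss_effect: float, ss_total: float) -> float:
    return ss_effect / ss_total


def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    return ss_effect / (ss_effect + ss_error)


# Hypothetical sums of squares (not real data).
ss_effect, ss_other_effects, ss_error = 10.0, 40.0, 50.0
ss_total = ss_effect + ss_other_effects + ss_error

eta_sq = eta_squared(ss_effect, ss_total)           # 10 / 100
partial = partial_eta_squared(ss_effect, ss_error)  # 10 / 60

print(round(math.sqrt(eta_sq), 3))   # 0.316
print(round(math.sqrt(partial), 3))  # 0.408
```

When no other effects contribute variance (`ss_other_effects = 0`), the two denominators coincide and both measures give the same $r$.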


Cohen's f

Cohen's $f$ is an effect size measure used commonly in the context of F-tests (ANOVA) and regression, representing the dispersion of means relative to the standard deviation (Cohen, 1988).

$$f = \sqrt{\frac{\eta^2}{1 - \eta^2}}$$

Where:

  • $\eta^2$: Eta squared (the proportion of variance explained).

Normalization to 0–1 Scale (Conversion to $r$)

The conversion is a two-step process (Cohen, 1988):

  1. Convert to $d$: $d = 2f$
  2. Convert to $r$: $r = \frac{d}{\sqrt{d^2 + 4}}$

Note: This is algebraically equivalent to $r = \frac{f}{\sqrt{1 + f^2}}$.
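A brief sketch showing that the two-step route and the closed form agree. Both function names are hypothetical.

```python
import math


def f_to_r(f: float) -> float:
    """Two-step conversion: Cohen's f -> d -> r."""
    d = 2 * f
    return d / math.sqrt(d ** 2 + 4)


def f_to_r_direct(f: float) -> float:
    """Closed form r = f / sqrt(1 + f^2)."""
    return f / math.sqrt(1 + f ** 2)


for f in (0.10, 0.25, 0.40):
    print(f, round(f_to_r(f), 6), round(f_to_r_direct(f), 6))
```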


Cohen's f² ($f^2$)

Cohen's $f^2$ is the squared version of Cohen's $f$, commonly used in regression contexts to measure effect size (Cohen, 1988).

$$f^2 = \frac{R^2}{1 - R^2}$$

Where:

  • $R^2$: The coefficient of determination.

Normalization to 0–1 Scale (Conversion to $r$)

The conversion is a two-step process:

  1. Convert to $R^2$: $R^2 = \frac{f^2}{1 + f^2}$
  2. Convert to $r$: $r = \sqrt{R^2}$
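The steps above can be sketched as follows. The function name `f2_to_r` is hypothetical, and the inputs are Cohen's (1988) conventional small, medium, and large benchmarks for $f^2$.

```python
import math


def f2_to_r(f2: float) -> float:
    """Convert Cohen's f-squared to r via R-squared."""
    if f2 < 0:
        raise ValueError("f squared must be non-negative")
    r_squared = f2 / (1 + f2)    # step 1: invert f2 = R2 / (1 - R2)
    return math.sqrt(r_squared)  # step 2: take the square root


for f2 in (0.02, 0.15, 0.35):
    print(f2, round(f2_to_r(f2), 3))
```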

R Squared ($R^2$)

$R^2$ (the coefficient of determination) represents the proportion of the variance in a dependent variable that is explained by one or more independent variables in a regression model.

$$R^2 = 1 - \frac{SS_{res}}{SS_{total}}$$

Where:

  • $SS_{res}$: The sum of squares of residuals (unexplained variance).
  • $SS_{total}$: The total sum of squares (total variance).

Normalization to 0–1 Scale (Conversion to $r$)

The database normalizes this value by simply taking the square root:

$$r = \sqrt{R^2}$$


Phi Coefficient ($\phi$)

The Phi coefficient is a measure of association for two binary variables (Cramér, 1946).

$$\phi = \frac{ad - bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}}$$

Where:

  • $a, b, c, d$: The frequencies in a $2 \times 2$ contingency table.

Normalization to 0–1 Scale (Conversion to $r$)

No conversion is needed for the Phi coefficient, as it is already equivalent to the Pearson correlation coefficient calculated for binary data.

$$r = \phi$$
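For reference, a sketch of computing the Phi coefficient from the four cell counts of a 2×2 table. The function name and the counts are hypothetical.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for a 2x2 table with rows (a, b) and (c, d)."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return (a * d - b * c) / denom


# Made-up counts: 30/10 in the first row, 15/25 in the second.
print(round(phi_coefficient(30, 10, 15, 25), 3))  # 0.378
```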


Pearson Correlation ($r$)

The Pearson correlation coefficient measures the linear correlation between two sets of data (Pearson, 1895).

$$r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2 \sum(y_i - \bar{y})^2}}$$

Where:

  • $x_i, y_i$: Individual sample points.
  • $\bar{x}, \bar{y}$: The sample means.

Normalization to 0–1 Scale

This metric serves as the target scale for the database, so no conversion is needed. As mentioned above, to maintain the "0 to 1" magnitude scale required by the database's coding scheme, original effect sizes are taken as their absolute value:

$$r_{coded} = |r_{reported}|$$


Test Statistics

The database can also convert APA-formatted test statistics directly to $r$ (Rosenthal, 1991; Borenstein et al., 2009).

t-test

Format: t(df) = value (e.g., t(10) = 2.5)

Conversion to $r$: $r = \frac{t}{\sqrt{t^2 + df}}$

Sign is preserved (negative t produces negative r).


F-test (df1 = 1 only)

Format: F(df1, df2) = value (e.g., F(1, 20) = 4.5)

Constraint: Only convertible when df1 = 1.

Conversion to $r$:

  1. Convert F to t: $t = \sqrt{F}$
  2. Convert t to r: $r = \frac{t}{\sqrt{t^2 + df_2}}$

Always positive (F-tests are non-directional).


z-test

Format: z = value, N = value (e.g., z = 2.81, N = 34)

Conversion to $r$: $r = \frac{z}{\sqrt{z^2 + N}}$

Sign is preserved.


Chi-squared (df = 1 only)

Format: χ2(1, N = value) = value or x2(1, N = value) = value (e.g., χ2(1, N = 12) = 5)

Constraint: Only convertible when df = 1.

Conversion to $r$: $r = \sqrt{\frac{\chi^2}{N}}$

Always positive.
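The four test-statistic conversions can be sketched together as a small parser. This is an illustration under assumptions, not the database's parser: the function names, the regular expressions, and the choice to return `None` for unconvertible input are all hypothetical.

```python
import math
import re
from typing import Optional

NUM = r"(-?\d+(?:\.\d+)?)"  # signed integer or decimal


def t_to_r(t: float, df: float) -> float:
    return t / math.sqrt(t ** 2 + df)  # sign of t is preserved


def f_stat_to_r(f: float, df1: float, df2: float) -> Optional[float]:
    if df1 != 1:
        return None  # only convertible when df1 = 1
    t = math.sqrt(f)
    return t / math.sqrt(t ** 2 + df2)


def z_to_r(z: float, n: float) -> float:
    return z / math.sqrt(z ** 2 + n)  # sign of z is preserved


def chi2_to_r(chi2: float, df: float, n: float) -> Optional[float]:
    if df != 1:
        return None  # only convertible when df = 1
    return math.sqrt(chi2 / n)


def apa_to_r(text: str) -> Optional[float]:
    """Parse an APA-style statistic string and convert it to r."""
    s = text.replace(" ", "")
    m = re.fullmatch(rf"t\({NUM}\)={NUM}", s)
    if m:
        return t_to_r(float(m[2]), float(m[1]))
    m = re.fullmatch(rf"F\({NUM},{NUM}\)={NUM}", s)
    if m:
        return f_stat_to_r(float(m[3]), float(m[1]), float(m[2]))
    m = re.fullmatch(rf"z={NUM},N={NUM}", s)
    if m:
        return z_to_r(float(m[1]), float(m[2]))
    m = re.fullmatch(rf"[χx]2\({NUM},N={NUM}\)={NUM}", s)
    if m:
        return chi2_to_r(float(m[3]), float(m[1]), float(m[2]))
    return None  # unrecognized format


for example in ("t(10) = 2.5", "F(1, 20) = 4.5", "z = 2.81, N = 34", "χ2(1, N = 12) = 5"):
    print(example, "->", round(apa_to_r(example), 3))
```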


Glass' delta

Glass's $\Delta$ (delta) is a standardized mean difference that uses only the control group's standard deviation as the denominator, rather than the pooled SD used by Cohen's $d$ (Glass, 1976).

$$\Delta = \frac{M_1 - M_2}{SD_{control}}$$

Where:

  • $M_1, M_2$: The means of the two groups.
  • $SD_{control}$: The standard deviation of the control group only.

Why not converted to $r$: The standard $d$-to-$r$ conversion assumes a pooled standard deviation. Using only one group's SD introduces asymmetry that makes the conversion unreliable without additional information about group variance ratios.
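A small sketch shows how far Glass's delta can drift from Cohen's d when the two groups have unequal variances. The data are made up, the function names are hypothetical, and the pooled SD uses the usual degrees-of-freedom weighting.

```python
import math
import statistics


def cohens_d(group1, group2) -> float:
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    sd_pooled = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / sd_pooled


def glass_delta(treatment, control) -> float:
    """Mean difference divided by the control group's SD only."""
    return (statistics.mean(treatment) - statistics.mean(control)) / statistics.stdev(control)


treatment = [2, 6, 10, 14, 18]  # wide spread
control = [6, 7, 8, 9, 10]      # narrow spread

print(round(glass_delta(treatment, control), 3))  # 1.265
print(round(cohens_d(treatment, control), 3))     # 0.434
```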


Cliff's delta

Cliff's $\delta$ is a non-parametric effect size that measures the degree of overlap between two distributions (Cliff, 1993). It represents the probability that a randomly selected observation from one group is larger than a randomly selected observation from the other, minus the reverse probability.

$$\delta = \frac{\#(x_i > y_j) - \#(x_i < y_j)}{n_1 \cdot n_2}$$

Where:

  • $x_i$: Observations from group 1.
  • $y_j$: Observations from group 2.
  • $n_1, n_2$: The sample sizes of the two groups.
  • $\#(x_i > y_j)$: The count of all pairwise comparisons where $x_i$ exceeds $y_j$.

Range: $-1$ to $+1$, where $0$ indicates complete overlap.

Why not converted to $r$: Cliff's delta is a non-parametric, ordinal-level measure with no distributional assumptions. Converting it to Pearson's $r$ (a parametric measure) would require assumptions about the underlying distributions that the statistic was specifically designed to avoid.
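The pairwise counting in the definition can be sketched directly. The function name is hypothetical and the brute-force loop is only meant for small samples.

```python
def cliffs_delta(xs, ys) -> float:
    """(pairs with x > y minus pairs with x < y) / (n1 * n2); ties count as zero."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Made-up data: 7 pairs favor xs, 1 favors ys, 1 is tied.
print(round(cliffs_delta([5, 6, 7], [1, 2, 6]), 3))  # 0.667
```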


Cohen's w

Cohen's $w$ is an effect size measure for chi-squared tests of goodness-of-fit or independence (Cohen, 1988). It quantifies the discrepancy between observed and expected proportions.

$$w = \sqrt{\sum_{i=1}^{m} \frac{(P_{1i} - P_{0i})^2}{P_{0i}}}$$

Where:

  • $P_{1i}$: The observed (or alternative hypothesis) proportion in category $i$.
  • $P_{0i}$: The expected (or null hypothesis) proportion in category $i$.
  • $m$: The number of categories.

Why not converted to $r$: Cohen's $w$ applies to multi-category frequency comparisons and does not map onto the two-variable linear association that Pearson's $r$ measures. While $w = \phi$ in the special case of a $2 \times 2$ table, the general case involves tables of arbitrary size.
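For illustration, Cohen's w for a three-category goodness-of-fit comparison can be computed as below. The function name and the proportions are hypothetical.

```python
import math


def cohens_w(observed, expected) -> float:
    """Cohen's w from observed and expected category proportions."""
    return math.sqrt(sum((p1 - p0) ** 2 / p0 for p1, p0 in zip(observed, expected)))


observed = [0.5, 0.3, 0.2]         # made-up observed proportions
expected = [1 / 3, 1 / 3, 1 / 3]   # uniform null hypothesis

print(round(cohens_w(observed, expected), 3))  # 0.374
```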


Spearman's rank correlation

Spearman's $r_s$ (rho) measures the monotonic relationship between two variables using their ranks rather than raw values (Spearman, 1904).

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

Where:

  • $d_i$: The difference between the ranks of the $i$-th paired observation.
  • $n$: The number of paired observations.

Range: $-1$ to $+1$, identical to Pearson's $r$.

Why not converted to $r$: Although Spearman's $r_s$ is on the same numerical scale as Pearson's $r$, it measures monotonic (not linear) association and is computed on ranks rather than raw values. Treating it as interchangeable with Pearson's $r$ in meta-analytic comparisons would conflate two distinct constructs.
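The distinction can be seen on a monotonic but nonlinear relationship. The sketch below uses hypothetical function names and made-up data, and the rank formula assumes there are no tied values.

```python
import math


def ranks(values):
    """Rank values from 1..n (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman_rs(xs, ys) -> float:
    """Spearman's r_s via the rank-difference formula (no ties)."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def pearson_r(xs, ys) -> float:
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # perfectly monotonic, but not linear

print(spearman_rs(xs, ys))          # 1.0
print(round(pearson_r(xs, ys), 3))  # below 1 because the relation is curved
```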


Non-Convertible Effect Sizes

The following effect sizes cannot be reliably converted to $r$ and thus will not have an entry computed for the replication_es_r and original_es_r columns:

  • Incidence Rate Difference (IRD): raw percentage-point differences between groups, on a scale of roughly −100 to +100, incompatible with the standardized 0–1 scale
  • Cramér's V
  • Cohen's h
  • Cohen's $d_z$ (standardized mean difference for paired designs)
  • Cliff's delta
  • Cohen's w
  • Regression coefficients ($b$, $\beta$)
  • Semi-partial correlations ($sr^2$)
  • Chi-squared with df > 1
  • Percentages

References

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. John Wiley & Sons.

Chinn, S. (2000). A simple method for converting an odds ratio to effect size for use in meta-analysis. Statistics in Medicine, 19(22), 3127–3131.

Cliff, N. (1993). Dominance statistics: Ordinal analyses to answer ordinal questions. Psychological Bulletin, 114(3), 494–509.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.

Cramér, H. (1946). Mathematical methods of statistics. Princeton University Press.

Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5(10), 3–8.

Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and related estimators. Journal of Educational Statistics, 6(2), 107–128.

Pearson, K. (1895). Notes on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58, 240–242.

Rosenthal, R. (1991). Meta-analytic procedures for social research (Rev. ed.). Sage Publications.

Spearman, C. (1904). The proof and measurement of association between two things. The American Journal of Psychology, 15(1), 72–101.