Open Lecture on Statistics
Statistical notes for clinical researchers: effect size
Hae-Young Kim
Restorative Dentistry & Endodontics 2015;40(4):328-331.
DOI: https://doi.org/10.5395/rde.2015.40.4.328
Published online: October 2, 2015

Department of Health Policy and Management, College of Health Science, and Department of Public Health Sciences, Graduate School, Korea University, Seoul, Korea.

Correspondence to Hae-Young Kim, DDS, PhD. Associate Professor, Department of Health Policy and Management, College of Health Science, and Department of Public Health Sciences, Graduate School, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, Korea 02841. TEL, +82-2-3290-5667; FAX, +82-2-940-2879; kimhaey@korea.ac.kr

©Copyrights 2015. The Korean Academy of Conservative Dentistry.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

In most clinical studies, the p value is the final product of data analysis. A small p value is interpreted as a significant difference between the experimental group and the control group. However, reporting the p value alone is not enough to convey the actual size of the difference. The problem with the p value is that it depends on the sample size, n: even a trivial, clinically meaningless difference can yield an extremely small p value when the sample size is large. To compensate for this weakness, we need to report the 'effect size' as well as the p value. The effect size is a simple measure of the actual difference, independent of the sample size.
In statistical testing, we first set a null hypothesis and calculate a test statistic, such as a t value, under the assumption that the null hypothesis is true. We then obtain a p value, which represents the probability of observing data as extreme as the current data by chance when the null hypothesis is true. In most scientific articles, conclusions are based on comparing the p value to a chosen alpha error level, e.g., 0.05; a p value smaller than the alpha level is interpreted as statistical significance. However, relying on the p value alone raises serious problems.
First, depending on the sample size, a wide range of p values can be obtained for the same size of difference, which can lead to contradictory conclusions: statistically significant or insignificant. Examples 1 and 2 in Table 1 share the same trivial difference of 3 between before and after treatment, assuming that a clinically meaningful difference is 10. Yet the two results are contradictory: statistically insignificant (p = 0.382, Example 1) when the sample size is small (n = 100), and statistically significant (p = 0.001, Example 2) when it is large (n = 10,000). Moreover, as Example 2 shows, it is a serious problem that a clinically meaningless effect can be concluded to be statistically significant. The treatment in Example 2 is clinically insignificant but statistically significant! What would you reasonably conclude in this case? This problem is caused by using an inappropriately large sample size.
Second, the information carried by the size of a p value is confusing, because it is confounded with the sample size. We might expect a small p value to tell us how far the observed data depart from the null hypothesis. However, the same p value can be obtained from quite different situations: in Table 1, Example 2, with a trivial effect and a large sample size, and Example 3, with a substantial effect and a smaller sample size, both show the same p value of 0.001. This shows that p values are confounded with the sample size.
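This confounding can be checked numerically. Below is a minimal sketch (my own illustration, not from the article) that recomputes the t values, p values, and effect sizes of Table 1 from the reported difference, SD, and n; the p values in the table match one-sided normal-approximation tail probabilities.

```python
from scipy.stats import norm

def t_and_p(diff, sd, n):
    """t statistic for a mean difference and its one-sided
    normal-approximation p value (matches Table 1)."""
    t = diff / (sd / n ** 0.5)  # t = difference / standard error
    p = norm.sf(t)              # upper-tail probability
    return t, p

# The three examples of Table 1: (difference, SD, n)
for diff, sd, n in [(3, 100, 100), (3, 100, 10_000), (30, 100, 100)]:
    t, p = t_and_p(diff, sd, n)
    print(f"diff={diff:>2}, n={n:>6}: t={t:.1f}, p={p:.3f}, "
          f"effect size={diff / sd:.2f}")
# diff= 3, n=   100: t=0.3, p=0.382, effect size=0.03
# diff= 3, n= 10000: t=3.0, p=0.001, effect size=0.03
# diff=30, n=   100: t=3.0, p=0.001, effect size=0.30
```

The same difference of 3 jumps from p = 0.382 to p = 0.001 purely because n grows, while the effect size stays fixed at 0.03.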
Both of these problems can be addressed by controlling the sample size. To avoid such discordant conclusions, a sample size determination procedure must be performed at the design stage of an experimental study. We generally calculate an appropriate sample size by taking the expected difference, the SD, the alpha error, and the power into account. The conclusion of a significance test is reliable only when an appropriate sample size was used. When we analyze survey data with a very large sample size, we need to consider the effect of the sample size when interpreting the test results.
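As a rough illustration of that design-stage calculation, the usual normal-approximation formula n ≈ ((z₁₋α/₂ + z₁₋β) / ES)² per group can be sketched as follows (a simplification of mine, not a procedure given in the article):

```python
from scipy.stats import norm

def sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided test of a
    standardized mean difference (normal approximation)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = norm.ppf(power)           # 0.84 for 80% power
    return ((z_alpha + z_beta) / effect_size) ** 2

# A small target effect size demands a much larger sample:
print(round(sample_size(0.1)))  # ~785 per group
print(round(sample_size(0.3)))  # ~87 per group
```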
The weakness of the p value can also be compensated for by reporting the effect size alongside it. As shown in Table 1, the effect size directly reflects the magnitude of the actual effect: 0.03 for the trivial difference and 0.3 for the substantial one.
'Effect size' is simply a way of quantifying the difference between compared groups, in other words, the actual effect.1 While the p value carries an important meaning in statistical inference, the effect size expresses descriptive importance. In Table 1, the effect sizes are the difference between the two group means divided by the standard deviation. Comparing Example 2 and Example 3, their effect sizes are quite different, 0.03 versus 0.3, while their p values are the same. Suppose clinicians generally consider a change of at least 10 clinically meaningful, while a change of 3 after treatment is negligible. They would then not apply the treatment for the small change of 3, even though the significance test concluded that the treatment is effective on the basis of a highly significant p value. Conversely, they would apply the treatment in Example 3, because they can expect a substantial change of 30 and the statistical test confirmed its significance. These results show that the effect size directly reflects the actual difference or effect. Therefore, reporting both the p value and the effect size is necessary in order to address both statistical significance and actual clinical significance.
Generally, there are two common types of effect size indices: standardized differences between groups and measures of association; a short computational sketch follows the list below. Table 2 shows the types of effect size indices and the conventional standards for small, medium, and large effects for each type.
  1. Between groups

    • 1) Cohen's d or Glass's Δ: defined as the difference between two group means divided by a standard deviation, for continuous outcomes. Cohen's d divides by the pooled standard deviation, under the assumption of equal variances, while Glass's Δ divides by the standard deviation of the control group.

    • 2) Odds ratio: defined as the ratio of the odds of the two compared groups, for binary outcomes.

    • 3) Relative risk (risk ratio): defined as the ratio of the proportions of the two compared groups, for binary outcomes.

  2. Measures of association

    • 1) Pearson's r correlation: an effect size representing the association between two variables.

    • 2) Pearson r² (coefficient of determination): the proportion of variation explained.
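As promised above, here is a minimal computational sketch of these indices (the helper names are my own, not from the article):

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Cohen's d: mean difference over the pooled SD (equal variances assumed)."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

def glass_delta(mean1, mean2, sd_control):
    """Glass's delta: mean difference over the control group's SD."""
    return (mean1 - mean2) / sd_control

def odds_ratio(p1, p2):
    """Odds ratio for binary outcomes with event proportions p1 and p2."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

def relative_risk(p1, p2):
    """Relative risk (risk ratio): ratio of two event proportions."""
    return p1 / p2

# With equal group SDs, d reduces to the simple standardized
# difference used in Table 1:
print(cohens_d(145, 115, 100, 100, 100, 100))  # 0.3
```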

Then, how should we interpret the magnitude of an effect size? Under the assumption that the data are normally distributed, Cohen's d is equivalent to a Z score of the standard normal distribution. If Cohen's d is zero, there is no mean difference between the two groups: the mean of the experimental group coincides exactly with the mean of the control group, so 50% of the observations in the control group lie below the mean of the experimental group (Table 3). A relatively 'small' effect size of 0.2 means that the mean of the experimental group is located 0.2 standard deviations above the mean of the control group; a Z score of 0.2 corresponds to the 58th percentile, so 58% of the control group's observations lie below it (Figure 1). Similarly, Cohen's d values of 0.5 and 0.8 place the experimental mean at the 69th and 79th percentiles of the control group's distribution, respectively.
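This percentile interpretation is nothing more than the standard normal cumulative distribution evaluated at d; a short check (my own, using scipy) reproduces the entries of Table 3:

```python
from scipy.stats import norm

# Percentage of the control group lying below the experimental mean,
# for each Cohen's d listed in Table 3
for d in [0.0, 0.2, 0.5, 0.8, 1.4]:
    print(f"d = {d}: {norm.cdf(d):.0%}")  # 50%, 58%, 69%, 79%, 92%
```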
The Pearson r correlation coefficient is an effect size that is widely understood and frequently used. Converting various statistics, including t or F values, into the Pearson r correlation coefficient can be advantageous because it facilitates interpretation. Cohen's d can also be converted into r. Table 4 provides the conversion formulas and brief explanations.
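The conversions in Table 4 are simple algebra and can be sketched directly (the function names are my own illustration):

```python
import math

def r_from_chi2(chi2, n):
    """r from a chi-square value with df = 1: r = sqrt(chi2 / N)."""
    return math.sqrt(chi2 / n)

def r_from_t(t, df):
    """r from a t value: r = sqrt(t^2 / (t^2 + df))."""
    return math.sqrt(t**2 / (t**2 + df))

def r_from_f(f, df_error):
    """r from an F with a single-df numerator: r = sqrt(F / (F + df_error))."""
    return math.sqrt(f / (f + df_error))

def r_from_d(d):
    """r from Cohen's d: r = d / sqrt(d^2 + 4)."""
    return d / math.sqrt(d**2 + 4)

# For example, a 'small' d of 0.2 corresponds to r of about 0.1:
print(round(r_from_d(0.2), 3))  # 0.1
```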
Although p values provide information on statistical significance, they are confounded with the sample size. The effect size makes up for this weak point by providing information on the actual effect, which is independent of the sample size. Therefore, reporting the effect size as well as the p value is recommended.
Figure 1
Distribution of control group (solid line) and experimental group (dotted line), and position of Cohen's d = 0.2.1
Table 1
Examples of results of significance testing using p value and comparative effect size

Example | Before | After | SD* | Diff. | n      | t value              | p value | Effect size  | Characteristics
1       | 145    | 142   | 100 | 3     | 100    | 0.3 = 3/(100/√100)   | 0.382   | 0.03 = 3/100 | Trivial effect & insignificant
2       | 145    | 142   | 100 | 3     | 10,000 | 3 = 3/(100/√10,000)  | 0.001   | 0.03 = 3/100 | Trivial effect & significant
3       | 145    | 115   | 100 | 30    | 100    | 3 = 30/(100/√100)    | 0.001   | 0.3 = 30/100 | Substantial effect & significant

*SD, standard deviation.

Table 2
Common effect size indices2

Type                    | Index                                     | Description                                                            | Standard                                         | Comment
Between groups          | Cohen's d or Glass's Δ                    | d or Δ = (Mean1 − Mean2) / SD* (d: pooled SD; Δ: SD of control group)  | Small 0.2; Medium 0.5; Large 0.8; Very large 1.3 | For continuous outcomes
Between groups          | Odds ratio (OR)                           | OR = odds1 / odds2                                                     | Small 1.5; Medium 2; Large 3                     | Degree of association between binary outcomes
Between groups          | Relative risk or risk ratio (RR)          | RR = p1 / p2                                                           | Small 2; Medium 3; Large 4                       | For binary outcomes; ratio of two proportions
Measures of association | Pearson's r correlation                   | Range −1 to 1                                                          | Small ±0.2; Medium ±0.3; Large ±0.5              | Measures the degree of linear relationship
Measures of association | Pearson r² (coefficient of determination) | Range 0 to 1                                                           | Small 0.04; Medium 0.09; Large 0.25              | Proportion of variance explained

*SD, standard deviation.

Table 3
Interpretation of Cohen's d, which represents a standardized difference [(Mean1 − Mean2) / SD*]1,3

Relative size | Effect size | % of control group below the mean of experimental group
              | 0.0         | 50%
Small         | 0.2         | 58%
Medium        | 0.5         | 69%
Large         | 0.8         | 79%
              | 1.4         | 92%

*SD, standard deviation.

Table 4
Conversion from various statistics to the Pearson r correlation coefficient3

Statistic            | Conversion formula         | Comment
χ² (df = 1)          | r = √(χ² / N)              | A single-degree-of-freedom chi-square value divided by the number of cases
t                    | r = √(t² / (t² + df))      | From a t value to the r correlation coefficient
F (df = 1, df_error) | r = √(F / (F + df_error))  | From an F value with a single-df numerator to r
Cohen's d            | r = d / √(d² + 4)          | From Cohen's d to r