Statistical notes for clinical researchers: the independent samples t-test

Restor Dent Endod : Restorative Dentistry & Endodontics

OPEN ACCESS

Restor Dent Endod. Volume 44(3); 2019
Open Lecture on Statistics
Hae-Young Kim

DOI: https://doi.org/10.5395/rde.2019.44.e26
Published online: July 17, 2019

Department of Health Policy and Management, College of Health Science, and Department of Public Health Science, Graduate School, Korea University, Seoul, Korea.

Correspondence to Hae-Young Kim, DDS, PhD. Professor, Department of Health Policy and Management, Korea University College of Health Science, and Department of Public Health Science, Korea University Graduate School, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Korea. kimhaey@korea.ac.kr

Copyright © 2019. The Korean Academy of Conservative Dentistry

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

The t-test is frequently used to compare 2 group means. The compared groups may be independent of each other, such as men and women. Alternatively, the compared data may be correlated, as in a comparison of blood pressure levels from the same person before and after medication (Figure 1). In this section we focus on the independent t-test only. There are 2 kinds of independent t-test, depending on whether the 2 group variances can be assumed equal or not. The t-test is based on inference using the t-distribution.
The t-distribution was introduced in 1908 by William Sealy Gosset, who was working for the Guinness brewery in Dublin, Ireland. Because the Guinness brewery did not permit its employees to publish research results related to their work, Gosset published his findings under a pseudonym, “Student.” The distribution he proposed therefore came to be called Student's t-distribution. The t-distribution is similar to the standard normal distribution (z-distribution), but has a lower peak and heavier tails (Figure 2).
According to sampling theory, when samples are drawn from a normally distributed population, the distribution of sample means is expected to be normal. When we know the population variance, σ², we can define the distribution of sample means as a normal distribution and adopt the z-distribution in statistical inference. In reality, however, we almost never know σ², so we use the sample variance, s², instead. Although s² is the best estimator of σ², its accuracy depends on the sample size. When the sample size is large enough (e.g., n = 300), we expect the sample variance to be very similar to the population variance. When the sample size is small, such as n = 10, the sample variance may be much less accurate. The t-distribution reflects this difference in uncertainty according to sample size. Therefore, the shape of the t-distribution changes with the degrees of freedom (df), which equal the sample size minus one (n − 1) when one sample mean is tested.
The t-distribution is a family of distributions whose shape varies with the df (Figure 2). When the df is smaller, the t-distribution has a lower peak and heavier tails than those with higher df. The shape of the t-distribution approaches that of the z-distribution as the df increases. When the sample is large enough, e.g., n = 300, the t-distribution is almost identical to the z-distribution. For inferences about means using small samples, it is necessary to apply the t-distribution, whereas similar inferences can be obtained from either the t-distribution or the z-distribution with a large sample. For inference about 2 means, we generally use the t-test based on the t-distribution regardless of sample size, because it is safe for tests with small df as well as for those with large df.
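The change in shape with df can be checked numerically. The following is a minimal Python sketch, not part of the original analysis, that evaluates the t density at its peak (x = 0) and in the tail (x = 3) for a few df values and compares them with the standard normal density. The function names `t_pdf` and `z_pdf` and the chosen df values are illustrative assumptions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def z_pdf(x):
    """Density of the standard normal (z) distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Compare peak height and tail height for illustrative df values.
for df in (1, 5, 30, 300):
    print(f"df={df:>3}: peak={t_pdf(0, df):.4f}  tail at x=3: {t_pdf(3, df):.5f}")
print(f"z     : peak={z_pdf(0):.4f}  tail at x=3: {z_pdf(3):.5f}")
```

For every df the peak is lower and the tail at x = 3 is higher than for the z-distribution, and the gap shrinks as df grows.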
To adopt the z- or t-distribution for inference using small samples, a basic assumption is that the population distribution is not significantly different from a normal distribution. As seen in Appendix 1, the normality assumption needs to be tested in advance. If the normality assumption cannot be met and the sample is small (n < 25), the ‘parametric’ t-test should not be used. Instead, a non-parametric analysis such as the Mann-Whitney U test should be selected.
For comparison of 2 independent group means, we can use a z-statistic to test the hypothesis of equal population means only if we know the population variances of the 2 groups, $\sigma_1^2$ and $\sigma_2^2$, as follows:
(Eq. 1)
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
where $\bar{X}_1$ and $\bar{X}_2$, $\sigma_1^2$ and $\sigma_2^2$, and $n_1$ and $n_2$ are the sample means, population variances, and sizes of the 2 groups, respectively.
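As a rough illustration of Eq. 1, the sketch below computes the z-statistic and a two-sided p value from summary statistics. It borrows the group means, standard deviations, and sizes from Table 1 and treats the squared standard deviations as if they were known population variances, which is a hypothetical assumption made only for this example. The function name `z_test_two_means` is likewise an assumption.

```python
import math
from statistics import NormalDist


def z_test_two_means(mean1, mean2, var1, var2, n1, n2):
    """z-statistic of Eq. 1 and its two-sided p value (variances assumed known)."""
    se = math.sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / se
    p = 2 * NormalDist().cdf(-abs(z))
    return z, p


# Hypothetical use: Table 1 values, pretending the variances are known.
z, p = z_test_two_means(10.28, 11.08, 0.5978**2, 0.4590**2, 10, 10)
print(f"z = {z:.3f}, two-sided p = {p:.5f}")
```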
Again, because we never know the population variances, we need to use the sample variances as their estimates. There are 2 methods, depending on whether the 2 population variances can be assumed equal or not. Under the assumption of equal variances, the t-test devised by Gosset in 1908, Student's t-test, can be applied. The other version is Welch's t-test, introduced in 1947, for cases where the assumption of equal variances cannot be accepted because a large difference is observed between the 2 sample variances.
1. Student's t-test
In Student's t-test, the population variances are assumed equal. Therefore, we need only one common variance estimate for the 2 groups. The common variance estimate is calculated as a pooled variance, a weighted average of the 2 sample variances, as follows:
(Eq. 2)
$$s_p^2 = \frac{(n_1 - 1)}{(n_1 - 1) + (n_2 - 1)}\, s_1^2 + \frac{(n_2 - 1)}{(n_1 - 1) + (n_2 - 1)}\, s_2^2$$
where $s_1^2$ and $s_2^2$ are the sample variances.
The resulting t-test statistic has the form of Eq. 1 with both population variances, $\sigma_1^2$ and $\sigma_2^2$, replaced by the common variance estimate, $s_p^2$. The df for the t-test statistic is $n_1 + n_2 - 2$.
(Eq. 3)
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t(n_1 + n_2 - 2)$$
In Appendix 1, ‘(E-1) Levene's test for equality of variances’ shows that the null hypothesis of equal variances was not rejected, given the high p value of 0.334 (under the heading Sig.). In ‘(E-2) t-test for equality of means t-values’, the upper line shows the result of Student's t-test. The t-value and df are shown as −3.357 and 18. We can obtain the same figures using Eq. 2 and Eq. 3 and the descriptive statistics in Table 1, as follows.
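$$s_p^2 = \frac{(10 - 1) \times 0.5978^2 + (10 - 1) \times 0.4590^2}{(10 - 1) + (10 - 1)} = \frac{5.1124}{18} = 0.2840 = (0.5329)^2$$
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{10.28 - 11.08}{0.5329 \sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} = \frac{-0.80}{0.2383} = -3.357$$
df = n1 + n2 − 2 = 10 + 10 − 2 = 18
The calculated values agree with those reported by SPSS (IBM Corp., Armonk, NY, USA) in Appendix 1.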
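The hand calculation with Eq. 2 and Eq. 3 can be reproduced with a short Python sketch using only the summary statistics in Table 1. The function names are illustrative assumptions, and the p value is obtained by integrating the t density numerically with Simpson's rule rather than by whatever method SPSS uses internally, so small differences in the last digits are possible.

```python
import math


def pooled_variance(s1, s2, n1, n2):
    """Pooled variance of Eq. 2 from the two sample standard deviations."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))


def student_t(mean1, mean2, s1, s2, n1, n2):
    """t-statistic of Eq. 3 and its degrees of freedom."""
    sp = math.sqrt(pooled_variance(s1, s2, n1, n2))
    t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p value: 1 minus the area under the density on [-|t|, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # area under the density on [0, |t|]
    return 1 - 2 * half_area


# Summary statistics from Table 1.
sp2 = pooled_variance(0.5978, 0.4590, 10, 10)
t, df = student_t(10.28, 11.08, 0.5978, 0.4590, 10, 10)
p = two_sided_p(t, df)
print(f"sp^2 = {sp2:.4f}, t = {t:.3f}, df = {df}, p = {p:.4f}")
```

The t-value and df should match the upper line of the SPSS output, and the p value should land close to the value reported in Table 1.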
2. Welch's t-test
In practice, there are many cases where equal variances cannot be assumed. Even when equal variances cannot be assumed, we can still compare 2 independent group means by performing Welch's t-test. Welch's t-test is more reliable when the 2 samples have unequal variances and/or unequal sample sizes. The assumption of normality must still hold.
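Because the population variances are not equal, we have to estimate them separately by the 2 sample variances, $s_1^2$ and $s_2^2$. As a result, the t-test statistic takes the following form:
(Eq. 4)
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim t_{\nu}$$
where $\nu$ is the Satterthwaite degrees of freedom.
(Eq. 5)
$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}}$$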
In Appendix 1, ‘(E-1) Levene's test for equality of variances’ shows that equal variances can reasonably be assumed (p = 0.334). Therefore, Welch's t-test is inappropriate for these data. Purely as an exercise, we can interpret the results of Welch's t-test shown in the lower line of ‘(E-2) t-test for equality of means t-values’. The t-value and df are shown as −3.357 and 16.875.
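$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{10.28 - 11.08}{\sqrt{\dfrac{0.5978^2}{10} + \dfrac{0.4590^2}{10}}} = \frac{-0.80}{0.2383} = -3.357$$
$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}} = \frac{\left(\dfrac{0.5978^2}{10} + \dfrac{0.4590^2}{10}\right)^2}{\dfrac{\left(0.5978^2 / 10\right)^2}{10 - 1} + \dfrac{\left(0.4590^2 / 10\right)^2}{10 - 1}} = \frac{0.05679^2}{0.0001419 + 0.0000493} = \frac{0.003225}{0.0001912} \approx 16.87$$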
The calculation using the formulas gives nearly the same results as the SPSS software.
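Eq. 4 and Eq. 5 can be checked in the same way with a minimal Python sketch based on the Table 1 summary statistics. The function name `welch_t` is an illustrative assumption.

```python
import math


def welch_t(mean1, mean2, s1, s2, n1, n2):
    """Welch's t-statistic (Eq. 4) and Satterthwaite degrees of freedom (Eq. 5)."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, nu


# Summary statistics from Table 1.
t, nu = welch_t(10.28, 11.08, 0.5978, 0.4590, 10, 10)
print(f"t = {t:.3f}, Satterthwaite df = {nu:.3f}")
```

Because the 2 groups have the same size here, the pooled and unpooled standard errors coincide, which is why Welch's t-value equals Student's t-value and only the df differs.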
The t-test is one of the most frequently used methods for comparing 2 group means. However, we sometimes forget its underlying assumptions, such as the normality assumption, or overlook the meaning of the equal variance assumption. Especially with a small sample, we need to check the normality assumption first and decide between the parametric t-test and the nonparametric Mann-Whitney U test. We also need to assess the assumption of equal variances and select either Student's t-test or Welch's t-test.
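The selection steps summarized above can be written down as a small decision helper. This is only a sketch of the logic described in this note: the function name `choose_test`, its arguments, and the reading of "small sample" as fewer than 25 observations per group are assumptions, and the normality and equal variance checks themselves (for example, Levene's test) are taken as inputs rather than computed.

```python
def choose_test(normality_ok, equal_variances, n_per_group, small_n=25):
    """Suggest a 2-group comparison test following the steps in this note.

    normality_ok    -- result of a normality check done beforehand
    equal_variances -- result of an equal variance check done beforehand
    n_per_group     -- sample size per group (assumed interpretation of n)
    small_n         -- threshold below which a sample counts as small
    """
    if not normality_ok and n_per_group < small_n:
        return "Mann-Whitney U test"
    if equal_variances:
        return "Student's t-test"
    return "Welch's t-test"


# Example data in this note: 10 per group, equal variances not rejected
# (p = 0.334), normality taken as satisfied for the sake of illustration.
print(choose_test(normality_ok=True, equal_variances=True, n_per_group=10))
```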
Appendix 1

Procedure of t-test analysis using IBM SPSS

The procedure of t-test analysis using IBM SPSS Statistics for Windows Version 23.0 (IBM Corp., Armonk, NY, USA) is as follows.
rde-44-e26-a001.jpg
Figure 1. Types of 2-sample t-test.

Figure 2. The t-distribution with various degrees of freedom (df) compared to z-distribution.

Table 1. Descriptive statistics and result of the Student's t-test

Group   No.   Mean    Standard deviation   p value
1       10    10.28   0.5978               0.004
2       10    11.08   0.4590
