Sample size determination for conducting a pilot study to assess reliability of a questionnaire
Abstract
This article is a narrative review that discusses the recommended sample size requirements for designing a pilot study to assess the reliability of a questionnaire. Sample size tables based on the kappa agreement test, intra-class correlation test, and Cronbach’s alpha test have been compiled. For all calculations, type I error (alpha) was set at a maximum value of 0.05, and power was set at a minimum value of 80.0%. For the kappa agreement test, intra-class correlation test, and Cronbach’s alpha test, the recommended minimum sample sizes based on the ideal effect sizes are 15, 22, and 24 subjects, respectively. After allowing for a non-response rate of 20.0%, a minimum sample size of 30 respondents will be sufficient to assess the reliability of the questionnaire. A clear guideline on the minimum sample size for such a pilot study is discussed, which will help researchers prepare for the pilot study. This study provides justification for a minimum sample size of 30 respondents specifically to test the reliability of a questionnaire.
INTRODUCTION
A pilot study is a preliminary investigation conducted before proceeding to the actual survey or experiment specifically designed to address many aspects of the research study. In survey research, a pilot study is mostly conducted to test the suitability of a study instrument before conducting an actual fieldwork phase by using the instrument. Besides that, a pilot study can also evaluate the overall performance characteristics of the chosen study design, study measures, research procedures, recruitment criteria, and various other operational strategies that are being considered for use in a subsequent, often much larger study [1].
It is usually very costly to conduct large-scale survey research. Any major flaw that occurs during the full-scale survey is very likely to waste resources such as time, manpower, and money. Therefore, a pilot study serves as an important prerequisite step before undertaking the survey research by ensuring that it is feasible to conduct the main study with a realistic ability to address its research objectives. One of the main concerns in planning a pilot study is determining the minimum sample size required to assess the reliability of the questionnaire.
Three common statistical tests are used to determine the reliability of the questionnaire: the kappa agreement test, intra-class correlation test, and Cronbach’s alpha test. Cohen’s kappa coefficient is a test statistic that determines the level of agreement between 2 different evaluations of a variable that is expressed in a categorical form. For test-retest reliability testing, an evaluation is performed by the same rater at 2 different times (say time 1 and time 2), usually with a lag time of about 1 to 2 weeks [2,3]. The goal of the test-retest reliability test is to determine to what extent the responses elicited at time 2 agree with those elicited at time 1. The kappa coefficient usually lies between −1 and 1, wherein a value of −1 indicates perfect disagreement and a value of +1 indicates perfect agreement [4].
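To make the definition concrete, the following is a minimal sketch of how Cohen’s kappa can be computed from test-retest responses, using the standard form kappa = (observed agreement − chance agreement) / (1 − chance agreement). The function name `cohen_kappa` and the responses are hypothetical and serve only as an illustration.

```python
from collections import Counter


def cohen_kappa(time1, time2):
    """Cohen's kappa for two equal-length sequences of categorical responses."""
    if len(time1) != len(time2) or len(time1) == 0:
        raise ValueError("responses must be non-empty and of equal length")
    n = len(time1)
    # Observed agreement: share of subjects giving the same answer both times.
    p_observed = sum(a == b for a, b in zip(time1, time2)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    counts1, counts2 = Counter(time1), Counter(time2)
    p_chance = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical responses from 10 subjects at time 1 and time 2.
t1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no"]
t2 = ["yes", "yes", "no", "yes", "yes", "no", "yes", "no", "no", "no"]
kappa = cohen_kappa(t1, t2)
print(round(kappa, 2))
```

In this made-up example, 8 of 10 responses agree and chance agreement is 0.5, so kappa is about 0.6. Identical responses give +1, and fully reversed responses give −1.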
Cohen’s kappa measures the level of agreement, or test-retest reliability, for a variable that is expressed in a categorical form, whereas the intra-class correlation is a statistical test that can be used to measure test-retest reliability for a variable in a numerical form. Some other studies have also used Pearson’s correlation coefficient to measure the level of test-retest reliability [5,6,7]. However, Pearson’s correlation coefficient can be misleading for this purpose because it only measures the correlation between 2 different ratings and does not take into account any systematic bias between them [8]. The intra-class correlation coefficient usually lies between 0 and 1, wherein a value of 0 indicates no agreement and a value of 1 indicates perfect agreement [9].
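The caution about systematic bias can be illustrated with a short sketch. The article does not state which form of the intra-class correlation is intended, so the one-way random-effects form, ICC(1,1), is assumed here. The function names and scores are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def icc_oneway(x, y):
    """One-way random-effects ICC(1,1) for 2 measurements per subject (assumed form)."""
    n, k = len(x), 2
    grand = mean(list(x) + list(y))
    subject_means = [(a + b) / 2 for a, b in zip(x, y)]
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for a, b, m in zip(x, y, subject_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical scores: every subject scores exactly 5 points higher at time 2.
time1 = [10, 12, 14, 16, 18]
time2 = [15, 17, 19, 21, 23]
r = pearson_r(time1, time2)
icc = icc_oneway(time1, time2)
print(round(r, 2), round(icc, 2))
```

With a constant shift of 5 points, Pearson’s correlation is 1 even though no subject gives the same score twice, while the assumed ICC is only about 0.23 because the shift counts as within-subject disagreement.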
Cronbach’s alpha is a measure of the internal consistency or reliability between several different items, measurements, or ratings. For questionnaire reliability testing, Cronbach’s alpha measures the internal consistency of a questionnaire or at least the domain(s) of a questionnaire. The response variable is usually measured in an interval form that is usually based on a Likert scale. The test was developed by Cronbach and was originally used to measure the reliability of a psychometric instrument [10]. The value of Cronbach’s alpha ranges from 0 to 1 with the higher values implying the items are measuring the same latent variable or dimension. On the contrary, if Cronbach’s alpha value is low (near 0), it means some or all of the items are not measuring the same dimension and so the questionnaire does not exhibit reliability or internal consistency [10,11].
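As an illustration, Cronbach’s alpha can be sketched with the standard variance-based form, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the Likert responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds k lists of scores from the same respondents."""
    k = len(items)
    if k < 2:
        raise ValueError("at least 2 items are required")
    # Total score per respondent, summed across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 5-point Likert responses: 3 items answered by 5 respondents.
item1 = [1, 2, 3, 4, 5]
item2 = [2, 2, 3, 4, 4]
item3 = [1, 3, 3, 5, 5]
alpha_value = cronbach_alpha([item1, item2, item3])
print(round(alpha_value, 2))
```

The three made-up items rise and fall together across respondents, so alpha comes out high (about 0.95). Items that did not move together would shrink the variance of the total scores and push alpha towards 0.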
One of the main reasons for conducting a pilot study for survey research is to test the reliability of a questionnaire. Therefore, this study aims to provide useful recommendations on how to determine the minimum sample size to design a pilot study to assess the reliability of the questionnaire. Such recommendations will assist researchers to conduct proper sample size planning to prepare for a pilot survey.
SOFTWARE
Sample size tables were compiled based on the minimum sample size requirements for performing the kappa agreement test, intra-class correlation test, and Cronbach’s alpha test. These statistical tests were selected because they are the most commonly applied in a wide variety of pilot studies. The minimum sample size requirement was estimated by using PASS 2022 Power Analysis and Sample Size Software (NCSS, LLC, Kaysville, UT, USA). PASS is commercial software that provides tools for determining sample size requirements for over 1,100 statistical tests and confidence interval scenarios. For all calculations, the significance level (alpha) and power were set at 0.05 and 80.0%, respectively.
Kappa agreement test
The calculation for the kappa agreement test was based on a formula introduced by Flack et al. [12]. Besides the significance level (alpha) and power (i.e., 1 – beta), 3 other parameters are required for the sample size calculation: ‘K1’ refers to the value of the kappa coefficient in the null hypothesis, ‘K2’ refers to the value of the kappa coefficient in the alternative hypothesis, and ‘category’ refers to the number of categories for a particular variable. K1 is set to 0, K2 takes the values 0.2, 0.3, 0.4, 0.5, 0.6, and 0.7, and the number of categories ranges from 2 by 2 up to 10 by 10. To simplify the calculation, the proportions are assumed to be equal across categories (i.e., for a 2 by 2 table, the responses are assumed to be 0.25 in each category). Most scholars agree that Cohen’s kappa coefficient should ideally reach at least 0.40 [13,14,15].
Intra-class correlation test
The calculation was performed by using a formula introduced by a previous study [16]. Besides the significance level (i.e., alpha) and power (i.e., 1 – beta), 3 other parameters are required for the sample size calculation: R0 refers to a pre-specified value of the intra-class coefficient in the null hypothesis, R1 refers to a pre-specified value of the intra-class coefficient in the alternative hypothesis, and the number of observations per subject is set to 2, corresponding to the 2 occasions of a test-retest reliability assessment. R0 is set at 0 and R1 takes the values 0.2, 0.3, 0.4, 0.5, 0.6, and 0.7. Most scholars agree that the intra-class coefficient should ideally reach at least 0.50 [8,17,18].
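For readers without access to PASS, the calculation can be approximated with a short sketch. The article does not reproduce the formula from [16], so this sketch assumes the closed-form approximation commonly attributed to Walter, Eliasziw, and Donner with a one-sided significance level: n = 1 + 2k(z_alpha + z_beta)² / [(ln C0)²(k − 1)], where C0 = [1 + kR0/(1 − R0)] / [1 + kR1/(1 − R1)]. The exact procedure used by PASS may differ, and the function name is hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def icc_min_sample_size(r0, r1, k=2, alpha=0.05, power=0.80):
    """Approximate minimum number of subjects for testing H0: ICC = r0 vs H1: ICC = r1.

    Assumes a one-sided test and k observations per subject (k = 2 for test-retest).
    """
    if not (0 <= r0 < 1 and 0 <= r1 < 1) or r0 == r1:
        raise ValueError("r0 and r1 must lie in [0, 1) and must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value (assumption)
    z_beta = NormalDist().inv_cdf(power)
    c0 = (1 + k * r0 / (1 - r0)) / (1 + k * r1 / (1 - r1))
    n = 1 + 2 * (z_alpha + z_beta) ** 2 * k / (log(c0) ** 2 * (k - 1))
    return ceil(n)


# Settings described in this section: R0 = 0 with 2 observations per subject.
for r1 in (0.2, 0.5, 0.7):
    print(r1, icc_min_sample_size(0.0, r1))
```

Under these assumptions the sketch returns 152, 22, and 10 subjects for R1 = 0.2, 0.5, and 0.7, respectively. These match the range of 10 to 152 and the recommended figure of 22 reported in this article, although that agreement does not confirm that PASS uses the same approximation.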
Cronbach’s alpha test
The calculation was done by using a formula introduced by a previous study [19]. Besides the significance level (i.e., alpha) and power (i.e., 1 – beta), 3 other parameters are required for the sample size calculation: CA0 refers to the value of Cronbach’s alpha coefficient in the null hypothesis, CA1 refers to the value of Cronbach’s alpha coefficient in the alternative hypothesis, and category (k) refers to the number of test items. CA0 is set at 0, CA1 takes the values 0.5, 0.6, 0.7, and 0.8, and the number of test items takes the values 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, and 55. The minimum value of Cronbach’s alpha coefficient should ideally reach at least 0.60 [20,21].
Ethical approval is not required since this article is a discussion from a methodological perspective. This article has been granted approval for publication by the Director General, Ministry of Health, Malaysia.
FINDINGS
For Cohen’s kappa agreement test, the minimum sample size requirement ranges from 3 to 194 subjects, depending on the conditions set (i.e., the value of K2 and the total number of categories for a response variable). As a general recommendation, this study bases the ideal sample size on 5 categories (i.e., to represent a 5-point Likert scale). By setting K1 as 0.0 and K2 as 0.4, the minimum required sample size is 15 (Table 1). For the intra-class correlation test, the minimum sample size requirement ranges from 10 to 152 subjects, depending on the conditions set (i.e., values of R0 and R1). As a general recommendation, this study bases the ideal sample size on R0 as 0.0 and R1 as 0.5, for which the minimum required sample size is 22 (Table 2). For Cronbach’s alpha test, the minimum sample size requirement ranges from 7 to 68 subjects, depending on the conditions set (i.e., values of CA0, CA1, and number of test items). As a general recommendation, this study bases the ideal sample size on CA0 as 0.0 and CA1 as 0.6 along with a group of 5 test items, for which the minimum required sample size is 24 (Table 3).
DISCUSSION
For survey research, a pilot study is usually conducted to assess the suitability of a study instrument or questionnaire for research purposes. This is very common for a survey that involves using a newly developed questionnaire or validating an existing questionnaire. Such a pilot study aims to assess whether the questionnaire can be correctly understood by the subjects, and whether the subjects are able to provide logical responses as valid and reliable feedback to each of the questions. Scientifically, it determines whether the questionnaire is feasible and also sufficiently reliable for use in a new research study. A poorly developed questionnaire may lead to a waste of resources because the information gathered from the study respondents will probably not be adequately valid and reliable.
Very few studies have discussed sample size requirements for conducting a pilot study that emphasizes scale development for a study instrument [22,23]. Moreover, these previous studies did not discuss sample size requirements for a pilot study that performs an agreement test, which is the ideal statistical test to examine test-retest reliability. When assessing the validity and/or reliability of a questionnaire via a pilot study, a researcher will usually evaluate the test-retest reliability by performing the kappa agreement test or intra-class correlation test. As discussed earlier in the introduction section, correlation is not an appropriate statistical test to measure the test-retest reliability of a study instrument because it does not take into account any systematic bias between the 2 ratings [8].
Besides the significance level (i.e., alpha) and power (i.e., 1 – beta), which are set at 0.05 and 80.0% respectively, the minimum required sample size also depends on certain pre-specified conditions and parameters set by researchers. For the kappa agreement test, intra-class correlation test, and Cronbach’s alpha test, the recommended minimum sample size requirement is calculated to be 15, 22, and 24 subjects, respectively. Allowing for a non-response rate of 20.0% and taking the largest of these minimum sample sizes (24), the minimum required sample size is at least 24/0.8 = 30 subjects.
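The non-response adjustment described above can be sketched as follows: the required number of respondents is divided by the expected response proportion and rounded up. The function name is hypothetical, and integer arithmetic on percentages is used only to avoid floating-point rounding surprises.

```python
def inflate_for_non_response(n_required, non_response_pct):
    """Smallest whole sample size that still leaves n_required after non-response."""
    if not 0 <= non_response_pct < 100:
        raise ValueError("non_response_pct must be at least 0 and below 100")
    responders_pct = 100 - non_response_pct
    # Ceiling division on integers: ceil(n_required * 100 / responders_pct).
    return -((-n_required * 100) // responders_pct)


# Minimum sample sizes recommended in this article, with 20% non-response.
for test_name, n in (("kappa", 15), ("intra-class correlation", 22), ("Cronbach's alpha", 24)):
    print(test_name, inflate_for_non_response(n, 20))
```

For the three recommended sample sizes of 15, 22, and 24, this yields 19, 28, and 30 respondents, respectively.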
Here, this study proposes sample size statements based on a recommendation from a previous study [24]. In that paper, Bujang recommends 5 steps for sample size determination: step 1, understand the objective of the study; step 2, decide on the appropriate statistical analysis; step 3, estimate or calculate the sample size; step 4, provide an additional allowance for possible non-response; and step 5, write a sample size statement. The examples are as follows:
To conduct test-retest reliability using the kappa agreement test
This study aims to determine the test-retest reliability of questionnaire A. Since the response variables are measured in categorical form, the kappa agreement test will be used for analysis. The response variables are in 5 categories representing a 5-point Likert scale. For the calculation, type I error (alpha) was set at a maximum value of 0.05, and power was set at a minimum value of 80.0%. By setting the kappa coefficient in the null hypothesis (K1) as 0.0 and the kappa coefficient in the alternative hypothesis (K2) as 0.4, the minimum required sample size is 15. To incorporate a non-response rate of 20.0%, a minimum sample size of 19 respondents is required.
To conduct test-retest reliability using an intra-class correlation test
This study aims to determine the test-retest reliability of questionnaire A. Since the response variables are measured in numerical form, an intra-class correlation test will be used for analysis. For the calculation, type I error (alpha) was set at a maximum value of 0.05, and power was set at a minimum value of 80.0%. By setting the intra-class coefficient in the null hypothesis (R0) as 0.0 and the intra-class coefficient in the alternative hypothesis (R1) as 0.5, the minimum required sample size is 22. To incorporate a non-response rate of 20.0%, a minimum sample size of 28 respondents is required.
To assess internal consistency using Cronbach’s alpha test
This study aims to determine the internal consistency of domain A. For this purpose, Cronbach’s alpha test will be used for analysis. For the calculation, type I error (alpha) was set at a maximum value of 0.05, and power was set at a minimum value of 80.0%. By setting Cronbach’s alpha coefficient in the null hypothesis (CA0) as 0.0 and the alpha coefficient in the alternative hypothesis (CA1) as 0.6 along with a group of 5 items, the minimum required sample size is 24. To incorporate a non-response rate of 20.0%, a minimum sample size of 30 respondents is required.
In a nutshell, assessing the reliability of a questionnaire will usually require a small sample size of no more than 30. This is in line with researchers’ expectations because they expect to recruit a small sample for a pilot study so that a large portion of patients or respondents can be reserved for the real survey. A larger sample size is often necessary to test the validity of the questionnaire, either by using exploratory factor analysis to determine its construct validity or by performing a sensitivity and specificity analysis for a questionnaire that is developed for screening purposes [25,26,27,28]. Therefore, the researcher can only assess the reliability of a questionnaire in a pilot study, whereas the validity of a questionnaire will have to be assessed in the actual full-scale survey because the validation process necessitates a much larger sample size [29,30].
CONCLUSION
In conclusion, the minimum sample size requirement for a pilot study depends on the aim of the pilot study itself and should be determined with careful consideration of all the statistical requirements. This paper provides useful recommendations for determining the minimum sample size when designing a pilot study to assess the reliability of a questionnaire. Generally, a sample size of at least 30 respondents is usually sufficient to assess the reliability of the questionnaire.
Notes
Conflict of Interest: No potential conflict of interest relevant to this article was reported.
Author Contributions:
Conceptualization: Bujang MA.
Formal analysis: Bujang MA.
Investigation: Omar ED, Hon YK.
Methodology: Bujang MA, Omar ED.
Project administration: Hon YK, Foo DHP.
Resources: Bujang MA.
Supervision: Bujang MA.
Validation: Omar ED, Foo DHP.
Writing - original draft: Bujang MA, Hon YK.
Writing - review & editing: Omar ED, Foo DHP.