Install and load pwrss R package:
install.packages("pwrss")
library(pwrss)
If you find the package and related material useful please cite as:
Bulus, M. (2023). pwrss: Statistical Power and Sample Size Calculation Tools. R package version 0.3.1. https://CRAN.R-project.org/package=pwrss
Bulus, M., & Polat, C. (in press). pwrss R paketi ile istatistiksel guc analizi [Statistical power analysis with pwrss R package]. Ahi Evran Universitesi Kirsehir Egitim Fakultesi Dergisi. https://osf.io/ua5fc/download/
Acknowledgements
I would like to thank Roland Thijs and Fred Oswald, who caught some typos in the earlier version of this vignette.
Please send any bug reports, feedback, or questions to bulusmetin [at] gmail.com
These generic functions calculate and return statistical power along
with optional Type I and Type II error plots (as long as the test
statistic and degrees of freedom is known). These functions are handy
because z, t, \(\chi^{2}\), and F statistics and
associated degrees of freedom can be readily found in reports, published
articles, or standard software output (such as SAS, Stata, SPSS, Jamovi,
etc.). Statistical power can be calculated via putting the test
statistics for ncp
in the functions below; however, do not
overinterpret post-hoc power rates.
power.t.test(ncp = 1.96, df = 99, alpha = 0.05,
alternative = "equivalent", plot = TRUE)
## power ncp.alt ncp.null.1 ncp.null.2 alpha df t.crit.1 t.crit.2
## 0.2371389 0 -1.96 1.96 0.05 99 -0.3155295 0.3155295
power.z.test(ncp = 1.96, alpha = 0.05,
alternative = "not equal", plot = TRUE)
## power ncp.alt ncp.null alpha z.crit.1 z.crit.2
## 0.5000586 1.96 0 0.05 -1.959964 1.959964
power.chisq.test(ncp = 15, df = 20,
alpha = 0.05, plot = TRUE)
## power ncp.alt ncp.null alpha df chisq.crit
## 0.6110368 15 0 0.05 20 31.41043
power.f.test(ncp = 3, df1 = 2, df2 = 98,
alpha = 0.05, plot = TRUE)
## power ncp.alt ncp.null alpha df1 df2 f.crit
## 0.3128778 3 0 0.05 2 98 3.089203
Multiple parameters are allowed but plots should be turned off
(plot = FALSE
).
power.t.test(ncp = c(0.50, 1.00, 1.50, 2.00, 2.50), plot = FALSE,
df = 99, alpha = 0.05, alternative = "not equal")
## power ncp.alt ncp.null alpha df t.crit.1 t.crit.2
## 0.07852973 0.5 0 0.05 99 -1.984217 1.984217
## 0.16769955 1.0 0 0.05 99 -1.984217 1.984217
## 0.31785490 1.5 0 0.05 99 -1.984217 1.984217
## 0.50826481 2.0 0 0.05 99 -1.984217 1.984217
## 0.69698027 2.5 0 0.05 99 -1.984217 1.984217
power.z.test(alpha = c(0.001, 0.010, 0.025, 0.050), plot = FALSE,
ncp = 1.96, alternative = "greater")
## power ncp.alt ncp.null alpha z.crit
## 0.1291892 1.96 0 0.001 3.090232
## 0.3570528 1.96 0 0.010 2.326348
## 0.5000144 1.96 0 0.025 1.959964
## 0.6236747 1.96 0 0.050 1.644854
power.chisq.test(df = c(80, 90, 100, 120, 150, 200), plot = FALSE,
ncp = 2, alpha = 0.05)
## power ncp.alt ncp.null alpha df chisq.crit
## 0.06989507 2 0 0.05 80 101.8795
## 0.06856779 2 0 0.05 90 113.1453
## 0.06746196 2 0 0.05 100 124.3421
## 0.06571411 2 0 0.05 120 146.5674
## 0.06382959 2 0 0.05 150 179.5806
## 0.06175379 2 0 0.05 200 233.9943
plot()
function (S3 method) is a wrapper around the
generic functions above. Assign results of any pwrss
function to an R object and pass it to plot()
function.
# comparing two means
design1 <- pwrss.t.2means(mu1 = 0.20, margin = -0.05, paired = TRUE,
power = 0.80, alpha = 0.05,
alternative = "non-inferior")
plot(design1)
# ANCOVA design
design2 <- pwrss.f.ancova(eta2 = 0.10, n.levels = c(2,3),
power = .80, alpha = 0.05)
plot(design2)
# indirect effect in mediation analysis
design3 <- pwrss.z.med(a = 0.10, b = 0.20, cp = 0.10,
power = .80, alpha = 0.05)
plot(design3)
More often than not, unstandardized means and standard deviations are reported in publications for descriptive purposes. Another reason is that they are more intuitive and interpretable (e.g. depression scale). Assume that for the first and second groups expected means are 30 and 28, and expected standard deviations are 12 and 8, respectively.
Calculate Statistical Power
What is the statistical power given that the sample size for the
second group is 50 (n2 = 50
) and groups have equal sample
sizes (kappa = n1 / n2 = 1
)?
pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, kappa = 1,
n2 = 50, alpha = 0.05,
alternative = "not equal")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 = mu2
## HA: mu1 != mu2
## ------------------------------
## Statistical power = 0.163
## n1 = 50
## n2 = 50
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 98
## Non-centrality parameter = 0.981
## Type I error rate = 0.05
## Type II error rate = 0.837
Calculate Minimum Required Sample Size
What is the minimum required sample size given that groups have equal
sample sizes (kappa = 1
)? (\(\kappa = n_1 / n_2\))
pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, kappa = 1,
power = .80, alpha = 0.05,
alternative = "not equal")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 = mu2
## HA: mu1 != mu2
## ------------------------------
## Statistical power = 0.8
## n1 = 410
## n2 = 410
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 818
## Non-centrality parameter = 2.808
## Type I error rate = 0.05
## Type II error rate = 0.2
It is sufficient to put pooled standard deviation for
sd1
because sd2 = sd1
by default. In this
case, for a pooled standard deviation of 10.198 the minimum required
sample size can be calculated as
pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 10.198, kappa = 1,
power = .80, alpha = 0.05,
alternative = "not equal")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 = mu2
## HA: mu1 != mu2
## ------------------------------
## Statistical power = 0.8
## n1 = 410
## n2 = 410
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 818
## Non-centrality parameter = 2.808
## Type I error rate = 0.05
## Type II error rate = 0.2
It is sufficient to put Cohen’s d or Hedge’s g
(standardized difference between two groups) for mu1
because mu2 = 0
, sd1 = 1
, and
sd2 = sd1
by default. For example, for an effect size as
small as 0.196 (based on previous example) the minimum required sample
size can be calculated as
pwrss.t.2means(mu1 = 0.196, kappa = 1,
power = .80, alpha = 0.05,
alternative = "not equal")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 = mu2
## HA: mu1 != mu2
## ------------------------------
## Statistical power = 0.8
## n1 = 410
## n2 = 410
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 818
## Non-centrality parameter = 2.806
## Type I error rate = 0.05
## Type II error rate = 0.2
Assume for the first (e.g. pretest) and second (e.g. posttest) time
points expected means are 30 and 28 (a reduction of 2 points), and
expected standard deviations are 12 and 8, respectively. Also assume a
correlation of 0.50 between first and second measurements (by default
paired.r = 0.50
). What is the minimum required sample
size?
pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8,
paired = TRUE, paired.r = 0.50,
power = .80, alpha = 0.05,
alternative = "not equal")
## Difference between Two means
## (Paired Samples t Test)
## H0: mu1 = mu2
## HA: mu1 != mu2
## ------------------------------
## Statistical power = 0.8
## n = 222
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 221
## Non-centrality parameter = 2.816
## Type I error rate = 0.05
## Type II error rate = 0.2
It is sufficient to put standard deviation of the difference for
sd1
because sd2 = sd1
by default. In this
case, for a standard deviation of difference of 10.583 the minimum
required sample size can be calculated as
pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 10.583,
paired = TRUE, paired.r = 0.50,
power = .80, alpha = 0.05,
alternative = "not equal")
## Difference between Two means
## (Paired Samples t Test)
## H0: mu1 = mu2
## HA: mu1 != mu2
## ------------------------------
## Statistical power = 0.8
## n = 222
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 221
## Non-centrality parameter = 2.816
## Type I error rate = 0.05
## Type II error rate = 0.2
It is sufficient to put Cohen’s d or Hedge’s g
(standardized difference between two time points) for mu1
because mu2 = 0
,
sd1 = sqrt(1/(2*(1-paired.r)))
, and sd2 = sd1
by default. For example, for an effect size as small as 0.1883 (based on
previous example) the minimum required sample size can be calculated
as
pwrss.t.2means(mu1 = 0.1883, paired = TRUE, paired.r = 0.50,
power = .80, alpha = 0.05,
alternative = "not equal")
## Difference between Two means
## (Paired Samples t Test)
## H0: mu1 = mu2
## HA: mu1 != mu2
## ------------------------------
## Statistical power = 0.8
## n = 224
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 223
## Non-centrality parameter = 2.818
## Type I error rate = 0.05
## Type II error rate = 0.2
The variable of interest for which means are compared may not follow normal distribution. One reason is that the sample size could be small. Another reason is that the parent distribution (distribution in the population) may not be normal (e.g., could be uniform, exponential, double exponential, or logistic). In these cases t test could produce biased results.
For non-parametric tests use pwrss.np.2groups()
function
instead of pwrss.t.2means()
. All arguments are the same.
Additionally, however, the parent distribution can be specified as
“normal”, “uniform”, “exponential”, “double exponential”, or “logistic”.
In the following examples the parent distribution is “normal” (not shown
because it is the default) and can be changed with the dist
or distribution
argument
(e.g. dist = "uniform"
).
Although means and standard deviations are some of the arguments in the function below, what is actually being tested is the difference between mean ranks. First, the mean difference is converted into Cohen’s d, and then into probability of superiority, which is the probability of an observation in group 1 being higher than an observation in group 2. This parameterization, expressed as means and standard deviations, helps in making comparisons and switching back and forth between parametric and non-parametric tests.
The example below uses the same parameters as the example in the independent t test section.
pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8, kappa = 1,
power = .80, alpha = 0.05,
alternative = "not equal")
## Non-parametric Difference between Two Groups (Independent samples)
## Mann-Whitney U or Wilcoxon Rank-sum Test
## (a.k.a Wilcoxon-Mann-Whitney Test)
## Method: GUENTHER
## ------------------------------
## Statistical power = 0.8
## n1 = 429
## n2 = 429
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = 2.805
## Degrees of freedom = 816.21
## Type I error rate = 0.05
## Type II error rate = 0.2
The example below uses the same parameters as the example in the paired (dependent) t test section.
pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8,
paired = TRUE, paired.r = 0.50,
power = .80, alpha = 0.05,
alternative = "not equal")
## Non-parametric Difference between Two Groups (Dependent samples)
## Wilcoxon signed-rank Test for Matched Pairs
## Method: GUENTHER
## ------------------------------
## Statistical power = 0.8
## n = 233
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = 2.814
## Degrees of freedom = 220.7
## Type I error rate = 0.05
## Type II error rate = 0.2
It is sufficient to put Cohen’s d or Hedge’s g
(standardized difference between two groups or measurements) for
mu1
without specifying mu2
, sd1
,
and sd2
.
These tests are useful for testing practically significant difference (non-inferiority/superiority) or practically null difference (equivalence).
Non-inferiority: The mean of group 1 is practically
not smaller than the mean of group 2. The mu1 - mu2
difference can be as small as -1 (margin = -1
) but it will
still be considered non-inferior. What is the minimum required sample
size?
When higher values of an outcome is better the margin takes NEGATIVE values; whereas when lower values of the outcome is better margin takes POSITIVE values.
pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8,
margin = -1, power = 0.80,
alternative = "non-inferior")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 - mu2 <= margin
## HA: mu1 - mu2 > margin
## ------------------------------
## Statistical power = 0.8
## n1 = 144
## n2 = 144
## ------------------------------
## Alternative = "non-inferior"
## Degrees of freedom = 286
## Non-centrality parameter = 2.496
## Type I error rate = 0.05
## Type II error rate = 0.2
Superiority: The mean of group 1 is practically
greater than the mean of group 2. The mu1 - mu2
difference
is at least greater than 1 (margin = 1
). What is the
minimum required sample size?
When higher values of an outcome is better margin takes POSITIVE values; whereas when lower values of the outcome is better margin takes NEGATIVE values.
pwrss.t.2means(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8,
margin = 1, power = 0.80,
alternative = "superior")
## Difference between Two means
## (Independent Samples t Test)
## H0: mu1 - mu2 <= margin
## HA: mu1 - mu2 > margin
## ------------------------------
## Statistical power = 0.8
## n1 = 1287
## n2 = 1287
## ------------------------------
## Alternative = "superior"
## Degrees of freedom = 2572
## Non-centrality parameter = 2.487
## Type I error rate = 0.05
## Type II error rate = 0.2
Equivalence: The mean of group 1 is practically same
as mean of group 2. The mu1 - mu2
difference can be as
small as -1 and as high as 1 (margin = 1
). What is the
minimum required sample size?
Specify the absolute value for the margin.
pwrss.t.2means(mu1 = 30, mu2 = 30, sd1 = 12, sd2 = 8,
margin = 1, power = 0.80,
alternative = "equivalent")
## Difference between Two means
## (Independent Samples t Test)
## H0: |mu1 - mu2| >= margin
## HA: |mu1 - mu2| < margin
## ------------------------------
## Statistical power = 0.8
## n1 = 1783
## n2 = 1783
## ------------------------------
## Alternative = "equivalent"
## Degrees of freedom = 3564
## Non-centrality parameter = -2.928
## Type I error rate = 0.05
## Type II error rate = 0.2
Non-inferiority: The mean of group 1 is practically
not smaller than the mean of group 2. The mu1 - mu2
difference can be as small as -1 (margin = -1
). What is the
minimum required sample size?
pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8,
margin = -1, power = 0.80,
alternative = "non-inferior")
## Non-parametric Difference between Two Groups (Independent samples)
## Mann-Whitney U or Wilcoxon Rank-sum Test
## (a.k.a Wilcoxon-Mann-Whitney Test)
## Method: GUENTHER
## ------------------------------
## Statistical power = 0.8
## n1 = 151
## n2 = 151
## ------------------------------
## Alternative = "non-inferior"
## Non-centrality parameter = 2.492
## Degrees of freedom = 285.13
## Type I error rate = 0.05
## Type II error rate = 0.2
Superiority: The mean of group 1 is practically
greater than the mean of group 2. The mu1 - mu2
difference
is at least greater than 1 (margin = 1
). What is the
minimum required sample size?
pwrss.np.2groups(mu1 = 30, mu2 = 28, sd1 = 12, sd2 = 8,
margin = 1, power = 0.80,
alternative = "superior")
## Non-parametric Difference between Two Groups (Independent samples)
## Mann-Whitney U or Wilcoxon Rank-sum Test
## (a.k.a Wilcoxon-Mann-Whitney Test)
## Method: GUENTHER
## ------------------------------
## Statistical power = 0.8
## n1 = 1348
## n2 = 1348
## ------------------------------
## Alternative = "superior"
## Non-centrality parameter = 2.487
## Degrees of freedom = 2571.3
## Type I error rate = 0.05
## Type II error rate = 0.2
Equivalence: The mean of group 1 is practically same
as mean of group 2. The mu1 - mu2
difference can be as
small as -1 and as high as 1 (margin = 1
). What is the
minimum required sample size?
pwrss.np.2groups(mu1 = 30, mu2 = 30, sd1 = 12, sd2 = 8,
margin = 1, power = 0.80,
alternative = "equivalent")
## Non-parametric Difference between Two Groups (Independent samples)
## Mann-Whitney U or Wilcoxon Rank-sum Test
## (a.k.a Wilcoxon-Mann-Whitney Test)
## Method: GUENTHER
## ------------------------------
## Statistical power = 0.8
## n1 = 1867
## n2 = 1867
## ------------------------------
## Alternative = "equivalent"
## Non-centrality parameter = -2.927
## Degrees of freedom = 3561.91
## Type I error rate = 0.05
## Type II error rate = 0.2
Omnibus F test in multiple liner regression is used to test whether \(R^2\) is greater than 0 (zero). Assume that we want to predict a continuous variable \(Y\) using \(X_{1}\), \(X_{2}\), and \(X_{2}\) variables (a combination of binary or continuous).
\[\begin{eqnarray} Y &=& \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \beta_{3}X_{3} + r, \quad r \thicksim N(0,\sigma^2) \newline \end{eqnarray}\]
We are expecting that these three variables explain 30% of the
variance in the outcome (\(R^2 = 0.30\)
or r2 = 0.30
in the code). What is the minimum required
sample size?
pwrss.f.reg(r2 = 0.30, k = 3, power = 0.80, alpha = 0.05)
## Linear Regression (F test)
## R-squared Deviation from 0 (zero)
## H0: r2 = 0
## HA: r2 > 0
## ------------------------------
## Statistical power = 0.8
## n = 30
## ------------------------------
## Numerator degrees of freedom = 3
## Denominator degrees of freedom = 25.653
## Non-centrality parameter = 12.709
## Type I error rate = 0.05
## Type II error rate = 0.2
Assume that we want to add two more predictors to the earlier
regression model (\(X_{4}\), and \(X_{5}\); thus m = 2
) and test
whether increase in \(R^2\) is greater
than 0 (zero). In total there are five predictors in the model
(k = 5
). By adding two predictors we are expecting an
increase of \(\Delta R^2 = 0.15\). What
is the minimum required sample size?
\[\begin{eqnarray} Y &=& \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \beta_{3}X_{3} + \beta_{4}X_{4} + \beta_{5}X_{5} + r, \quad r \thicksim N(0,\sigma^2) \newline \end{eqnarray}\]
pwrss.f.reg(r2 = 0.15, k = 5, m = 2, power = 0.80, alpha = 0.05)
## Hierarchical Linear Regression (F test)
## R-squared Change
## H0: r2 = 0
## HA: r2 > 0
## ------------------------------
## Statistical power = 0.8
## n = 58
## ------------------------------
## Numerator degrees of freedom = 2
## Denominator degrees of freedom = 51.876
## Non-centrality parameter = 10.213
## Type I error rate = 0.05
## Type II error rate = 0.2
In the earlier example, assume that we want to predict a continuous variable \(Y\) using a continuous predictor \(X_{1}\) but control for \(X_{2}\), and \(X_{2}\) variables (a combination of binary or continuous). We are mainly interested in the effect of \(X_{1}\) and expect a standardized regression coefficient of \(\beta_{1} = 0.20\).
\[\begin{eqnarray} Y &=& \beta_{0} + \color{red} {\beta_{1} X_{1}} + \beta_{2}X_{2} + \beta_{3}X_{3} + r, \quad r \thicksim N(0,\sigma^2) \newline \end{eqnarray}\]
Again, we are expecting that these three variables explain 30% of the
variance in the outcome (\(R^2 =
0.30\)). What is the minimum required sample size? It is
sufficient to provide standardized regression coefficient for
beta1
because sdx = 1
and sdy = 1
by default.
pwrss.t.reg(beta1 = 0.20, k = 3, r2 = 0.30,
power = .80, alpha = 0.05, alternative = "not equal")
## Linear Regression Coefficient (t Test)
## H0: beta1 = beta0
## HA: beta1 != beta0
## ------------------------------
## Statistical power = 0.8
## n = 140
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 135.331
## Non-centrality parameter = 2.822
## Type I error rate = 0.05
## Type II error rate = 0.2
For unstandardized coefficients specify sdy
and
sdx
. Assume we are expecting an unstandardized regression
coefficient of beta1 = 0.60
, a standard deviation of
sdy = 12
for the outcome and a standard deviation of
sdx = 4
for the main predictor. What is the minimum
required sample size?
pwrss.t.reg(beta1 = 0.60, sdy = 12, sdx = 4, k = 3, r2 = 0.30,
power = .80, alpha = 0.05, alternative = "not equal")
## Linear Regression Coefficient (t Test)
## H0: beta1 = beta0
## HA: beta1 != beta0
## ------------------------------
## Statistical power = 0.8
## n = 140
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 135.331
## Non-centrality parameter = 2.822
## Type I error rate = 0.05
## Type II error rate = 0.2
If the main predictor is binary (e.g. treatment/control), the
standardized regression coefficient is Cohen’s d. Standard
deviation of the main predictor is \(\sqrt{p(1-p)}\) where p is the
proportion of sample in one of the groups. Assume half of the sample is
in the first group \(p = 0.50\). What
is the minimum required sample size? It is sufficient to provide Cohen’s
d for beta1
(standardized difference between two
groups) but specify sdx = sqrt(p*(1-p))
where p is
the proportion of subjects in one of the groups.
p <- 0.50
pwrss.t.reg(beta1 = 0.20, k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
power = .80, alpha = 0.05, alternative = "not equal")
## Linear Regression Coefficient (t Test)
## H0: beta1 = beta0
## HA: beta1 != beta0
## ------------------------------
## Statistical power = 0.8
## n = 552
## ------------------------------
## Alternative = "not equal"
## Degrees of freedom = 547.355
## Non-centrality parameter = 2.807
## Type I error rate = 0.05
## Type II error rate = 0.2
These tests are useful for testing practically significant effects (non-inferiority/superiority) or practically null effects (equivalence).
Non-inferiority: The intervention is expected to be
non-inferior to some earlier or other interventions. Assume that the
effect of an earlier or some other intervention is
beta0 = 0.10
. The beta1 - beta0
is expected to
be positive and should be at least -0.05 (margin = -0.05
).
What is the minimum required sample size?
This is the case when higher values of an outcome is better. When
lower values of an outcome is better the beta1 - beta0
difference is expected to be NEGATIVE and the margin
takes
POSITIVE values.
p <- 0.50
pwrss.t.reg(beta1 = 0.20, beta0 = 0.10, margin = -0.05,
k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
power = .80, alpha = 0.05, alternative = "non-inferior")
## Linear Regression Coefficient (t Test)
## H0: beta1 - beta0 <= margin
## HA: beta1 - beta0 > margin
## ------------------------------
## Statistical power = 0.8
## n = 771
## ------------------------------
## Alternative = "non-inferior"
## Degrees of freedom = 766.745
## Non-centrality parameter = 2.489
## Type I error rate = 0.05
## Type II error rate = 0.2
Superiority: The intervention is expected to be
superior to some earlier or other interventions. Assume that the effect
of an earlier or some other intervention is beta0 = 0.10
.
The beta1 - beta0
is expected to be positive and should be
at least 0.05 (margin = 0.05
). What is the minimum required
sample size?
This is the case when higher values of an outcome is better. When
lower values of an outcome is better beta1 - beta0
difference is expected to be NEGATIVE and the margin
takes
NEGATIVE values.
p <- 0.50
pwrss.t.reg(beta1 = 0.20, beta0 = 0.10, margin = 0.05,
k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
power = .80, alpha = 0.05, alternative = "superior")
## Linear Regression Coefficient (t Test)
## H0: beta1 - beta0 <= margin
## HA: beta1 - beta0 > margin
## ------------------------------
## Statistical power = 0.8
## n = 6926
## ------------------------------
## Alternative = "superior"
## Degrees of freedom = 6921.818
## Non-centrality parameter = 2.487
## Type I error rate = 0.05
## Type II error rate = 0.2
Equivalence: The intervention is expected to be
equivalent to some earlier or other interventions. Assume the effect of
an earlier or some other intervention is beta0 = 0.20
. The
beta1 - beta0
is expected to be within -0.05 and 0.05
(margin = 0.05
). What is the minimum required sample
size?
margin
always takes positive values for equivalence.
Specify the absolute value.
p <- 0.50
pwrss.t.reg(beta1 = 0.20, beta0 = 0.20, margin = 0.05,
k = 3, r2 = 0.30, sdx = sqrt(p*(1-p)),
power = .80, alpha = 0.05, alternative = "equivalent")
## Linear Regression Coefficient (t Test)
## H0: |beta1 - beta0| >= margin
## HA: |beta1 - beta0| < margin
## ------------------------------
## Statistical power = 0.8
## n = 9593
## ------------------------------
## Alternative = "equivalent"
## Degrees of freedom = 9588.862
## Non-centrality parameter = -2.927 2.927
## Type I error rate = 0.05
## Type II error rate = 0.2
In logistic regression a binary outcome variable (0/1: failed/passed, dead/alive, absent/present) is modeled by predicting probability of being in group 1 (\(P_1\)) via logit transformation (natural logarithm of odds). The base probability \(P_0\) is the overall probability of being in group 1 without influence of predictors in the model (null). Under alternative hypothesis, the probability of being in group 1 (\(P_1\)) deviate from \(P_0\) depending on the value of the predictor; whereas under null it is same as the \(P_0\). A model with one main predictor (\(X_1\)) and two other covariates (\(X_2\) and \(X_3\)) can be constructed as
\[\begin{eqnarray} ln(\frac{P_1}{1- P_1}) &=& \beta_{0} + \color{red} {\beta_{1} X_{1}} + \beta_{2}X_{2} + \beta_{3}X_{3} \newline \end{eqnarray}\]
where
\[\beta_0 = ln(\frac{P_0}{1-
P_0})\]
\[\beta_1 =
ln(\frac{P_1}{1- P_1} / \frac{P_0}{1- P_0})\]
Odds ratio is
defined as \[OR = exp(\beta_1) =
\frac{P_1}{1- P_1} / \frac{P_0}{1- P_0}\]
Example:
Assume
r2.other.x = 0.20
in the code). It can be found in the
form of adjusted R-square via regressing \(X_1\) on \(X_2\) and \(X_3\). Higher values require larger sample
sizes. The default is 0 (zero).What is the minimum required sample size? There are three types of specification to statistical power or sample size calculations; (i) probability specification, (ii) odds ratio specification, and (iii) regression coefficient specification (as in standard software output).
Probability specification:
pwrss.z.logreg(p0 = 0.15, p1 = 0.10, r2.other.x = 0.20,
power = 0.80, alpha = 0.05,
dist = "normal")
## Logistic Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'normal'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 365
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.766
## Type I error rate = 0.05
## Type II error rate = 0.2
Odds ratio specification: \[OR = \frac{P_1}{1-P1} / \frac{P_0}{1-P_0} = \frac{0.10}{1-0.10} / \frac{0.15}{1-0.15} = 0.6296\]
pwrss.z.logreg(p0 = 0.15, odds.ratio = 0.6296, r2.other.x = 0.20,
alpha = 0.05, power = 0.80,
dist = "normal")
## Logistic Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'normal'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 365
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.766
## Type I error rate = 0.05
## Type II error rate = 0.2
Regression coefficient specification: \[\beta_1 = ln(\frac{P_1}{1-P1} / \frac{P_0}{1-P_0}) = ln(0.6296) = -0.4626\]
pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
alpha = 0.05, power = 0.80,
dist = "normal")
## Logistic Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'normal'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 365
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.766
## Type I error rate = 0.05
## Type II error rate = 0.2
Change the distribution’s parameters for predictor X:
The mean and standard deviation of a normally distributed main predictor is 0 and 1 by default. They can be modified. In the following example the mean is 25 and the standard deviation is 8.
dist.x <- list(dist = "normal", mean = 25, sd = 8)
pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
alpha = 0.05, power = 0.80,
dist = dist.x)
## Logistic Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'normal'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 2435
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.423
## Type I error rate = 0.05
## Type II error rate = 0.2
Change the distribution family of predictor X:
More distribution types are supported by the function. For example,
the main predictor can be binary (e.g. treatment/control groups). Often
half of the sample is assigned to the treatment group and the other half
to the control (prob = 0.50
by default).
pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
alpha = 0.05, power = 0.80,
dist = "bernoulli")
## Logistic Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'bernoulli'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 1723
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.789
## Type I error rate = 0.05
## Type II error rate = 0.2
Change the treatment group allocation rate of the binary
predictor X (prob = 0.40
):
The per-subject cost in a treatment group is sometimes substantially higher than the per-subject cost in control group. At other times, it is difficult to recruit subjects for a treatment whereas there are plenty of subjects to be considered for the control group (e.g. a long wait-list). In such cases there is some leeway to pick an unbalanced sample (but not too much unbalanced). Assume the treatment group allocation rate is 40%. What is the minimum required sample size?
dist.x <- list(dist = "bernoulli", prob = 0.40)
pwrss.z.logreg(p0 = 0.15, beta1 = -0.4626, r2.other.x = 0.20,
alpha = 0.05, power = 0.80,
dist = dist.x)
## Logistic Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'bernoulli'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 1826
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.766
## Type I error rate = 0.05
## Type II error rate = 0.2
In Poisson regression a count outcome variable (e.g. number of hospital/store/website visits, number of absence/dead/purchase in a day/week/month) is modeled by predicting incidence rate (\(\lambda\)) via logarithmic transformation (natural logarithm of rates). A model with one main predictor (\(X_1\)) and two other covariates (\(X_2\) and \(X_3\)) can be constructed as
\[\begin{eqnarray} ln(\lambda) &=& \beta_{0} + \color{red} {\beta_{1} X_{1}} + \beta_{2}X_{2} + \beta_{3}X_{3} \newline \end{eqnarray}\]
where \(exp(\beta_0)\) is the base
incidence rate.
\[\beta_1 =
ln(\frac{\lambda(X_1=1)}{\lambda(X_1=0)})\]
Incidence rate
ratio is defined as \[exp(\beta_1) =
\frac{\lambda(X_1=1)}{\lambda(X_1=0)}\]
Assume
What is the minimum required sample size? There are two types of specification; (i) rate ratio specification (exponentiated regression coefficient), and (ii) raw regression coefficient specification (as in standard software output).
Regression coefficient specification:
pwrss.z.poisreg(beta0 = 0.50, beta1 = -0.10,
power = 0.80, alpha = 0.05,
dist = "normal")
## Poisson Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'normal'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 474
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.802
## Type I error rate = 0.05
## Type II error rate = 0.2
Rate ratio specification:
pwrss.z.poisreg(exp.beta0 = exp(0.50),
exp.beta1 = exp(-0.10),
power = 0.80, alpha = 0.05,
dist = "normal")
## Poisson Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'normal'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 474
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.802
## Type I error rate = 0.05
## Type II error rate = 0.2
Change the distribution’s parameters for predictor X:
dist.x <- list(dist = "normal", mean = 25, sd = 8)
pwrss.z.poisreg(exp.beta0 = exp(0.50),
exp.beta1 = exp(-0.10),
alpha = 0.05, power = 0.80,
dist = dist.x)
## Poisson Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'normal'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 66
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.802
## Type I error rate = 0.05
## Type II error rate = 0.2
The function accomodates other types of distribution. For example, the main predictor can be binary (e.g. treatment/control groups).
Regression coefficient specification:
pwrss.z.poisreg(beta0 = 0.50, beta1 = -0.10,
alpha = 0.05, power = 0.80,
dist = "bernoulli")
## Poisson Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'bernoulli'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 2003
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.801
## Type I error rate = 0.05
## Type II error rate = 0.2
Rate ratio specification:
pwrss.z.poisreg(exp.beta0 = exp(0.50),
exp.beta1 = exp(-0.10),
alpha = 0.05, power = 0.80,
dist = "bernoulli")
## Poisson Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'bernoulli'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 2003
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.801
## Type I error rate = 0.05
## Type II error rate = 0.2
Change treatment group allocation rate of the binary
predictor X (prob = 0.40
):
dist.x <- list(dist = "bernoulli", prob = 0.40)
pwrss.z.poisreg(exp.beta0 = exp(0.50),
exp.beta1 = exp(-0.10),
alpha = 0.05, power = 0.80,
dist = dist.x)
## Poisson Regression Coefficient
## (Large Sample Approx. Wald's z Test)
## H0: beta1 = 0
## HA: beta1 != 0
## Distribution of X = 'bernoulli'
## Method = DEMIDENKO(VC)
## ------------------------------
## Statistical power = 0.8
## n = 2095
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.792
## Type I error rate = 0.05
## Type II error rate = 0.2
A simple mediation model can be constructed as in the figure.
Regression models take the form of
\[\begin{eqnarray} M &=& \beta_{0M} + \color{red} {a} X + e\newline Y &=& \beta_{0Y} + \color{red} {b} M + \color{red} {cp} X + \epsilon \newline \end{eqnarray}\]
Y is the outcome, M is the mediator, and X
is the main predictor. The indirect effect is the product of a
and b path coefficients. cp is the path coefficient
for the direct effect. Path coefficients can be standardized or
unstandardized. They are presumed to be standardized by default because
the main predictor, mediator, and outcome all have standard deviations
of sdx = 1
, sdm = 1
, and sdy = 1
in the function. Standard deviations should be specified for
unstandardized path coefficients (you can find these values in
descriptive tables in reports/publications).
X is continuous:
Most software applications presume no covariates in the mediator and outcome models (\(R^2_M = 0\) and \(R^2_M = 0\)). Even with no covariates, X explain some of the variance in M (\(R^2_M > 0\)) and M & X explains some of the variance in Y (\(R^2_Y > 0\)). We will almost never have an R-squared value of 0 (zero). The explained variance in the basic mediation model (the base R-squared values) can be non-trivial and is taken into account in the function by default. Thus, results may seem different from other software outputs.
# mediation model with base R-squared values
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
power = 0.80, alpha = 0.05)
## Indirect Effect in Mediation Model
## H0: a * b = 0
## HA: a * b != 0
## ---------------------------
## test power n ncp
## Sobel 0.8 242 2.802
## Aroian 0.8 250 2.802
## Goodman 0.8 235 2.802
## Joint NA NA NA
## Monte Carlo NA NA NA
## ----------------------------
## Type I error rate = 0.05
To match results with other software packages, explicitly specify
these parameters as 0 (zero) (r2m = 0
and
r2y = 0
). There will be some warnings indicating that the
function expects R-squared values greater than the base R-squared value.
Note that in this case cp
argument will be ignored.
# base R-squared values are 0 (zero)
# do not specify 'cp'
pwrss.z.med(a = 0.25, b = 0.25,
r2m = 0, r2y = 0,
power = 0.80, alpha = 0.05)
## Warning: specified 'r2m.x' is smaller than the base 'r2m.x'
## Warning: specified 'r2y.mx' is smaller than the base 'r2y.mx'
## Indirect Effect in Mediation Model
## H0: a * b = 0
## HA: a * b != 0
## ---------------------------
## test power n ncp
## Sobel 0.8 252 2.802
## Aroian 0.8 259 2.802
## Goodman 0.8 243 2.802
## Joint NA NA NA
## Monte Carlo NA NA NA
## ----------------------------
## Type I error rate = 0.05
X is binary:
The case where the main predictor is binary can be useful in practice. A researcher might be interested in whether treatment influence the outcome through some mediators.
p <- 0.50 # proportion of subjects in one of the groups
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
sdx = sqrt(p*(1-p)),
power = 0.80, alpha = 0.05)
## Indirect Effect in Mediation Model
## H0: a * b = 0
## HA: a * b != 0
## ---------------------------
## test power n ncp
## Sobel 0.8 614 2.802
## Aroian 0.8 626 2.802
## Goodman 0.8 602 2.802
## Joint NA NA NA
## Monte Carlo NA NA NA
## ----------------------------
## Type I error rate = 0.05
Statistical power for joint and Monte Carlo interval tests
Joint and Monte Carlo tests are only available when power is requested.
# binary X
p <- 0.50 # proportion of subjects in one of the groups
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
sdx = sqrt(p*(1-p)),
n = 300, alpha = 0.05)
## Indirect Effect in Mediation Model
## H0: a * b = 0
## HA: a * b != 0
## ---------------------------
## test power n ncp
## Sobel 0.500 300 1.959
## Aroian 0.484 300 1.920
## Goodman 0.516 300 2.000
## Joint 0.584 300 NA
## Monte Carlo 0.565 300 NA
## ----------------------------
## Type I error rate = 0.05
Explanatory Power of Covariates
Covariates can be added to the mediator model, outcome model, or
both. Explanatory power of the covariates (R-squared values) for the
mediator and outcome models can be specified via r2m.x
and
r2y.mx
arguments.
# continuous X
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
r2m = 0.50, r2y = 0.50,
power = 0.80, alpha = 0.05)
## Indirect Effect in Mediation Model
## H0: a * b = 0
## HA: a * b != 0
## ---------------------------
## test power n ncp
## Sobel 0.8 189 2.802
## Aroian 0.8 194 2.802
## Goodman 0.8 183 2.802
## Joint NA NA NA
## Monte Carlo NA NA NA
## ----------------------------
## Type I error rate = 0.05
In experimental design subjects are randomly assigned to treatment and control groups. The allocation usually takes place with a rate of 50% (or probability of 0.50) meaning that half of the sample is assigned to treatment and the other half to control. Thus, the mediator model is less likely to have confounder. It is more common to add covariates to the outcome model only.
# binary X
p <- 0.50 # proportion of subjects in one of the groups
pwrss.z.med(a = 0.25, b = 0.25, cp = 0.10,
sdx = sqrt(p*(1-p)), r2y = 0.50,
power = 0.80, alpha = 0.05)
## Indirect Effect in Mediation Model
## H0: a * b = 0
## HA: a * b != 0
## ---------------------------
## test power n ncp
## Sobel 0.8 559 2.802
## Aroian 0.8 566 2.802
## Goodman 0.8 551 2.802
## Joint NA NA NA
## Monte Carlo NA NA NA
## ----------------------------
## Type I error rate = 0.05
ANOVA: Analysis of Variance
ANCOVA: Analysis of Covariance
In a one-way ANOVA, a researcher might be interested in comparing the means of several groups (levels of a factor) with respect to a continuous variable. In a one-way ANCOVA, they may want to adjust means for covariates. Furthermore, they may want to inspect interaction between two or three factors (two-way or three-way ANOVA), or adjust interaction for covariates (two-way or three-way ANCOVA). The following examples illustrate how to determine minimum required sample size for such designs.
A researcher is expecting a difference of Cohen’s d = 0.50
between treatment and control groups (two levels) translating into \(\eta^2 = 0.059\)
(eta2 = 0.059
). Means are not adjusted for any covariates.
What is the minimum required sample size?
pwrss.f.ancova(eta2 = 0.059, n.levels = 2,
power = .80, alpha = 0.05)
## One-way Analysis of Variance (ANOVA)
## H0: 'eta2' or 'f2' = 0
## HA: 'eta2' or 'f2' > 0
## --------------------------------------
## Factor A: 2 levels
## --------------------------------------
## effect power n.total ncp df1 df2
## A 0.8 128 7.971 1 125.132
## --------------------------------------
## Type I error rate: 0.05
A researcher is expecting an adjusted difference of Cohen’s
d = 0.45 between treatment and control groups
(n.levels = 2
) after controlling for the pretest
(n.cov = 1
) translating into partial \(\eta^2 = 0.048\)
(eta2 = 0.048
). What is the minimum required sample
size?
pwrss.f.ancova(eta2 = 0.048, n.levels = 2, n.cov = 1,
alpha = 0.05, power = .80)
## One-way Analysis of Covariance (ANCOVA)
## H0: 'eta2' or 'f2' = 0
## HA: 'eta2' or 'f2' > 0
## --------------------------------------
## Factor A: 2 levels
## --------------------------------------
## effect power n.total ncp df1 df2
## A 0.8 158 7.948 1 154.626
## --------------------------------------
## Type I error rate: 0.05
n.cov
(or n.covariates
) argument has
trivial effect on the results. The difference between ANOVA and ANCOVA
procedure depends on whether the effect (eta2
) is
unadjusted or covariate-adjusted.
A researcher is expecting a partial \(\eta^2 = 0.03\) (eta2 = 0.03
)
for interaction of treatment/control (Factor A: two levels) with gender
(Factor B: two levels). Thus, n.levels = c(2,2)
. What is
the minimum required sample size?
pwrss.f.ancova(eta2 = 0.03, n.levels = c(2,2),
alpha = 0.05, power = 0.80)
## Two-way Analysis of Variance (ANOVA)
## H0: 'eta2' or 'f2' = 0
## HA: 'eta2' or 'f2' > 0
## --------------------------------------
## Factor A: 2 levels
## Factor B: 2 levels
## --------------------------------------
## effect power n.total ncp df1 df2
## A 0.8 256 7.909 1 251.73
## B 0.8 256 7.909 1 251.73
## A x B 0.8 256 7.909 1 251.73
## --------------------------------------
## Type I error rate: 0.05
A researcher is expecting a partial \(\eta^2 = 0.02\) (eta2 = 0.02
)
for interaction of treatment/control (Factor A) with gender (Factor B)
adjusted for the pretest (n.cov = 1
). What is the minimum
required sample size?
pwrss.f.ancova(eta2 = 0.02, n.levels = c(2,2), n.cov = 1,
alpha = 0.05, power = .80)
## Two-way Analysis of Covariance (ANCOVA)
## H0: 'eta2' or 'f2' = 0
## HA: 'eta2' or 'f2' > 0
## --------------------------------------
## Factor A: 2 levels
## Factor B: 2 levels
## --------------------------------------
## effect power n.total ncp df1 df2
## A 0.8 387 7.889 1 381.539
## B 0.8 387 7.889 1 381.539
## A x B 0.8 387 7.889 1 381.539
## --------------------------------------
## Type I error rate: 0.05
A researcher is expecting a partial \(\eta^2 = 0.02\) (eta2 = 0.02
)
for interaction of treatment/control (Factor A: two levels), gender
(Factor B: two levels), and socio-economic status (Factor C: three
levels). Thus, n.levels = c(2,2,3)
. What is the minimum
required sample size?
pwrss.f.ancova(eta2 = 0.02, n.levels = c(2,2,3),
alpha = 0.05, power = 0.80)
## Three-way Analysis of Variance (ANOVA)
## H0: 'eta2' or 'f2' = 0
## HA: 'eta2' or 'f2' > 0
## --------------------------------------
## Factor A: 2 levels
## Factor B: 2 levels
## Factor C: 3 levels
## --------------------------------------
## effect power n.total ncp df1 df2
## A 0.8 387 7.889 1 374.576
## B 0.8 387 7.889 1 374.576
## C 0.8 476 9.697 2 463.167
## A x B 0.8 387 7.889 1 374.576
## A x C 0.8 476 9.697 2 463.167
## B x C 0.8 476 9.697 2 463.167
## A x B x C 0.8 476 9.697 2 463.167
## --------------------------------------
## Type I error rate: 0.05
A researcher is expecting a partial \(\eta^2 = 0.01\) (eta2 = 0.01
)
for interaction of treatment/control (Factor A), gender (Factor B), and
socio-economic status (Factor C: three levels) adjusted for the pretest
(n.cov = 1
). What is the minimum required sample size?
pwrss.f.ancova(eta2 = 0.01, n.levels = c(2,2,3), n.cov = 1,
alpha = 0.05, power = .80)
## Three-way Analysis of Covariance (ANCOVA)
## H0: 'eta2' or 'f2' = 0
## HA: 'eta2' or 'f2' > 0
## --------------------------------------
## Factor A: 2 levels
## Factor B: 2 levels
## Factor C: 3 levels
## --------------------------------------
## effect power n.total ncp df1 df2
## A 0.8 779 7.869 1 765.990
## B 0.8 779 7.869 1 765.990
## C 0.8 957 9.665 2 943.868
## A x B 0.8 779 7.869 1 765.990
## A x C 0.8 957 9.665 2 943.868
## B x C 0.8 957 9.665 2 943.868
## A x B x C 0.8 957 9.665 2 943.868
## --------------------------------------
## Type I error rate: 0.05
The group (between) effect: In a one-way repeated measures ANOVA, a researcher might be interested in comparing the means of several groups (levels of a factor) with respect to an outcome variable (continuous) after controlling for the effect of time. The time effect can be thought as the improvement/grow/loss/deterioration in the outcome variable that happens naturally or due to unknown causes.
The time (within) effect: A researcher might be interested in comparing the means of several time points with respect to the outcome variable after controlling for the group effect. In other words, the change across time points is not due to the group membership but due to the improvement/grow/loss/deterioration in the outcome variable that happens naturally or due to unknown causes.
The group x time interaction: A researcher might suspect that means across groups and means across time points are not independent from each other. The amount of improvement/grow/loss/deterioration in the outcome variable depends the group membership.
Example 1: Posttest only design with treatment and control groups.
A researcher is expecting a difference of Cohen’s d = 0.50
on the posttest score between treatment and control groups
(n.levels = 2
), translating into \(\eta^2 = 0.059\)
(eta2 = 0.059
). The test is administered at a single time
point; thus, the ‘number of repeated measures’ is n.rm = 1
.
What is the minimum required sample size?
pwrss.f.rmanova(eta2 = 0.059, n.levels = 2, n.rm = 1,
power = 0.80, alpha = 0.05,
type = "between")
## One-way Repeated Measures
## Analysis of Variance (F test)
## H0: eta2 = 0 (or f2 = 0)
## HA: eta2 > 0 (or f2 > 0)
## ------------------------------
## Number of levels (groups) = 2
## Number of repeated measurements = 1
## ------------------------------
## Statistical power = 0.8
## Total n = 128
## ------------------------------
## Type of the effect = "between"
## Numerator degrees of freedom = 1
## Denominator degrees of freedom = 125.132
## Non-centrality parameter = 7.971
## Type I error rate = 0.05
## Type II error rate = 0.2
Example 2: Pretest-posttest design with treatment group only.
A researcher is expecting a difference of Cohen’s d = 0.30
between posttest and pretest scores, translating into \(\eta^2 = 0.022\)
(eta2 = 0.022
). The test is administered before and after
the treatment; thus, the ‘number of repeated measures’ is
n.rm = 2
. There is treatment group but no control group
(n.levels = 1
). The researcher also expects a correlation
of 0.50 (corr.rm = 0.50
) between pretest and posttest
scores. What is the minimum required sample size?
pwrss.f.rmanova(eta2 = 0.022, n.levels = 1, n.rm = 2,
power = 0.80, alpha = 0.05,
corr.rm = 0.50, type = "within")
## One-way Repeated Measures
## Analysis of Variance (F test)
## H0: eta2 = 0 (or f2 = 0)
## HA: eta2 > 0 (or f2 > 0)
## ------------------------------
## Number of levels (groups) = 1
## Number of repeated measurements = 2
## ------------------------------
## Statistical power = 0.8
## Total n = 90
## ------------------------------
## Type of the effect = "within"
## Numerator degrees of freedom = 1
## Denominator degrees of freedom = 88.169
## Non-centrality parameter = 8.023
## Type I error rate = 0.05
## Type II error rate = 0.2
Example 3: Pretest-posttest control-group design.
A researcher is expecting a difference of Cohen’s d = 0.40
on the posttest scores between treatment and control groups
(n.levels = 2
) after controlling for pretest, translating
into partial \(\eta^2 = 0.038\)
(eta2 = 0.038
). The test is administered before and after
the treatment; thus, the ‘number of repeated measures’ is
n.rm = 2
. The researcher also expects a correlation of 0.50
(corr.rm = 0.50
) between pretest and posttest scores. What
is the minimum required sample size?
pwrss.f.rmanova(eta2 = 0.038, n.levels = 2, n.rm = 2,
power = 0.80, alpha = 0.05,
corr.rm = 0.50, type = "between")
## One-way Repeated Measures
## Analysis of Variance (F test)
## H0: eta2 = 0 (or f2 = 0)
## HA: eta2 > 0 (or f2 > 0)
## ------------------------------
## Number of levels (groups) = 2
## Number of repeated measurements = 2
## ------------------------------
## Statistical power = 0.8
## Total n = 151
## ------------------------------
## Type of the effect = "between"
## Numerator degrees of freedom = 1
## Denominator degrees of freedom = 148.97
## Non-centrality parameter = 7.951
## Type I error rate = 0.05
## Type II error rate = 0.2
A researcher is expecting a difference of Cohen’s d = 0.30
between posttest and pretest scores (n.rm = 2
) after
controlling for group membership, translating into partial \(\eta^2 = 0.022\)
(eta2 = 0.022
). There is both treatment and control groups
(n.levels = 2
). The researcher also expects a correlation
of 0.50 (corr.rm = 0.50
) between pretest and posttest
scores. What is the minimum required sample size?
pwrss.f.rmanova(eta2 = 0.022, n.levels = 2, n.rm = 2,
power = 0.80, alpha = 0.05,
corr.rm = 0.50, type = "within")
## One-way Repeated Measures
## Analysis of Variance (F test)
## H0: eta2 = 0 (or f2 = 0)
## HA: eta2 > 0 (or f2 > 0)
## ------------------------------
## Number of levels (groups) = 2
## Number of repeated measurements = 2
## ------------------------------
## Statistical power = 0.8
## Total n = 90
## ------------------------------
## Type of the effect = "within"
## Numerator degrees of freedom = 1
## Denominator degrees of freedom = 87.191
## Non-centrality parameter = 8.025
## Type I error rate = 0.05
## Type II error rate = 0.2
The rationale for inspecting the interaction is that the benefit of
the treatment may depend on the pretest score (e.g. those with higher
scores on the pretest improve or deteriorate more). A researcher is
expecting an interaction effect of partial \(\eta^2 = 0.01\) (eta2 = 0.01
).
The test is administered before and after the treatment; thus, the
‘number of repeated measures’ is n.rm = 2
. There is both
treatment and control groups (n.levels = 2
). The researcher
also expects a correlation of 0.50 (corr.rm = 0.50
) between
pretest and posttest scores. What is the minimum required sample
size?
pwrss.f.rmanova(eta2 = 0.01, n.levels = 2, n.rm = 2,
power = 0.80, alpha = 0.05,
corr.rm = 0.50, type = "interaction")
## One-way Repeated Measures
## Analysis of Variance (F test)
## H0: eta2 = 0 (or f2 = 0)
## HA: eta2 > 0 (or f2 > 0)
## ------------------------------
## Number of levels (groups) = 2
## Number of repeated measurements = 2
## ------------------------------
## Statistical power = 0.8
## Total n = 197
## ------------------------------
## Type of the effect = "interaction"
## Numerator degrees of freedom = 1
## Denominator degrees of freedom = 194.199
## Non-centrality parameter = 7.927
## Type I error rate = 0.05
## Type II error rate = 0.2
Effect size: Cohen’s W for goodness-of-fit test (1 x k or k x 1 table).
How many subjects are needed to claim that girls choose STEM related majors less than males? Check the original article at https://www.aauw.org/resources/research/the-stem-gap/
Option 1: Use cell probabilities
Alternative hypothesis state
that 28% of the workforce in STEM field is women whereas 72% is men
(from the article).
The null hypothesis assume 50% is women and 50% is men.
prob.mat <- c(0.28, 0.72)
pwrss.chisq.gofit(p1 = c(0.28, 0.72),
p0 = c(0.50, 0.50),
alpha = 0.05, power = 0.80)
## Pearson's Chi-square Goodness-of-fit Test
## for Contingency Tables
## ------------------------------
## Statistical power = 0.8
## Total n = 41
## ------------------------------
## Degrees of freedom = 1
## Non-centrality parameter = 7.849
## Type I error rate = 0.05
## Type II error rate = 0.2
Option 2: Use Cohen’s W = 0.44.
Degrees of freedom is k
- 1 for Cohen’s W.
pwrss.chisq.gofit(w = 0.44, df = 1,
alpha = 0.05, power = 0.80)
## Pearson's Chi-square Goodness-of-fit Test
## for Contingency Tables
## ------------------------------
## Statistical power = 0.8
## Total n = 41
## ------------------------------
## Degrees of freedom = 1
## Non-centrality parameter = 7.849
## Type I error rate = 0.05
## Type II error rate = 0.2
Effect size: Phi Coefficient (or Cramer’s V or Cohen’s W) for independence test (2 x 2 contingency table).
How many subjects are needed to claim that girls are underdiagnosed
with ADHD?
Check the original article at https://time.com/growing-up-with-adhd/
Option 1: Use cell probabilities.
5.6% of girls and 13.2% of
boys are diagnosed with ADHD (from the article).
prob.mat <- rbind(c(0.056, 0.132),
c(0.944, 0.868))
colnames(prob.mat) <- c("Girl", "Boy")
rownames(prob.mat) <- c("ADHD", "No ADHD")
prob.mat
## Girl Boy
## ADHD 0.056 0.132
## No ADHD 0.944 0.868
pwrss.chisq.gofit(p1 = prob.mat,
alpha = 0.05, power = 0.80)
## Pearson's Chi-square Goodness-of-fit Test
## for Contingency Tables
## ------------------------------
## Statistical power = 0.8
## Total n = 463
## ------------------------------
## Degrees of freedom = 1
## Non-centrality parameter = 7.849
## Type I error rate = 0.05
## Type II error rate = 0.2
Option 2: Use Phi coefficient = 0.1302134.
Degrees of freedom is
1 for Phi coefficient.
pwrss.chisq.gofit(w = 0.1302134, df = 1,
alpha = 0.05, power = 0.80)
## Pearson's Chi-square Goodness-of-fit Test
## for Contingency Tables
## ------------------------------
## Statistical power = 0.8
## Total n = 463
## ------------------------------
## Degrees of freedom = 1
## Non-centrality parameter = 7.849
## Type I error rate = 0.05
## Type II error rate = 0.2
Effect size: Cramer’s V (or Cohen’s W) for independence test (j x k contingency tables).
How many subjects are needed to detect the relationship between
depression severity and gender?
Check the original article at https://doi.org/10.1016/j.jad.2019.11.121
Option 1: Use cell probabilities (from the article).
prob.mat <- cbind(c(0.6759, 0.1559, 0.1281, 0.0323, 0.0078),
c(0.6771, 0.1519, 0.1368, 0.0241, 0.0101))
rownames(prob.mat) <- c("Normal", "Mild", "Moderate", "Severe", "Extremely Severe")
colnames(prob.mat) <- c("Female", "Male")
prob.mat
## Female Male
## Normal 0.6759 0.6771
## Mild 0.1559 0.1519
## Moderate 0.1281 0.1368
## Severe 0.0323 0.0241
## Extremely Severe 0.0078 0.0101
pwrss.chisq.gofit(p1 = prob.mat,
alpha = 0.05, power = 0.80)
## Pearson's Chi-square Goodness-of-fit Test
## for Contingency Tables
## ------------------------------
## Statistical power = 0.8
## Total n = 13069
## ------------------------------
## Degrees of freedom = 4
## Non-centrality parameter = 11.935
## Type I error rate = 0.05
## Type II error rate = 0.2
Option 2: Use Cramer’s V = 0.03022008 based on 5 x 2 contingency
table
Degrees of freedom is (nrow - 1) * (ncol - 1) for Cramer’s
V.
pwrss.chisq.gofit(w = 0.03022008, df = 4,
alpha = 0.05, power = 0.80)
## Pearson's Chi-square Goodness-of-fit Test
## for Contingency Tables
## ------------------------------
## Statistical power = 0.8
## Total n = 13069
## ------------------------------
## Degrees of freedom = 4
## Non-centrality parameter = 11.935
## Type I error rate = 0.05
## Type II error rate = 0.2
One-sided Test: Assume that the expected correlation is 0.20 and it is greater than 0.10. What is the minimum required sample size?
pwrss.z.corr(r = 0.20, r0 = 0.10,
power = 0.80, alpha = 0.05,
alternative = "greater")
## A Correlation against a Constant (z Test)
## H0: r = r0
## HA: r > r0
## ------------------------------
## Statistical power = 0.8
## n = 593
## ------------------------------
## Alternative = "greater"
## Non-centrality parameter = 2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Two-sided Test: Assume that the expected correlation is 0.20 and it is different from 0 (zero). The correlation could be 0.20 as well as -0.20. What is the minimum required sample size?
pwrss.z.corr(r = 0.20, r0 = 0,
power = 0.80, alpha = 0.05,
alternative = "not equal")
## A Correlation against a Constant (z Test)
## H0: r = r0
## HA: r != r0
## ------------------------------
## Statistical power = 0.8
## n = 194
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = 2.802
## Type I error rate = 0.05
## Type II error rate = 0.2
Assume that the expected correlations in the first and second groups
are 0.30 and 0.20, respectively (r1 = 0.30
and
r2 = 0.20
).
One-sided Test: Expecting r1 - r2
greater than 0 (zero). The difference could be 0.10 but could not be
-0.10. What is the minimum required sample size?
pwrss.z.2corrs(r1 = 0.30, r2 = 0.20,
power = .80, alpha = 0.05,
alternative = "greater")
## Difference between Two Correlations
## (Independent Samples z Test)
## H0: r1 = r2
## HA: r1 > r2
## ------------------------------
## Statistical power = 0.8
## n1 = 1088
## n2 = 1088
## ------------------------------
## Alternative = "greater"
## Non-centrality parameter = 2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Two-sided Test: Expecting r1 - r2
different from 0 (zero). The difference could be -0.10 as well as 0.10.
What is the minimum required sample size?
pwrss.z.2corrs(r1 = 0.30, r2 = 0.20,
power = .80, alpha = 0.05,
alternative = "not equal")
## Difference between Two Correlations
## (Independent Samples z Test)
## H0: r1 = r2
## HA: r1 != r2
## ------------------------------
## Statistical power = 0.8
## n1 = 1380
## n2 = 1380
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = 2.802
## Type I error rate = 0.05
## Type II error rate = 0.2
In the following examples p
is the proportion under
alternative hypothesis and p0
is the proportion under null
hypothesis.
One-sided Test: Expecting p - p0
smaller than 0 (zero).
# normal approximation
pwrss.z.prop(p = 0.45, p0 = 0.50,
alpha = 0.05, power = 0.80,
alternative = "less",
arcsin.trans = FALSE)
## Approach: Normal Approximation
## A Proportion against a Constant (z Test)
## H0: p = p0
## HA: p < p0
## ------------------------------
## Statistical power = 0.8
## n = 613
## ------------------------------
## Alternative = "less"
## Non-centrality parameter = -2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
# arcsine transformation
pwrss.z.prop(p = 0.45, p0 = 0.50,
alpha = 0.05, power = 0.80,
alternative = "less",
arcsin.trans = TRUE)
## Approach: Arcsine Transformation
## A Proportion against a Constant (z Test)
## H0: p = p0
## HA: p < p0
## ------------------------------
## Statistical power = 0.8
## n = 617
## ------------------------------
## Alternative = "less"
## Non-centrality parameter = -2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Two-sided Test: Expecting p - p0
smaller than 0 (zero) or greater than 0 (zero).
pwrss.z.prop(p = 0.45, p0 = 0.50,
alpha = 0.05, power = 0.80,
alternative = "not equal")
## Approach: Normal Approximation
## A Proportion against a Constant (z Test)
## H0: p = p0
## HA: p != p0
## ------------------------------
## Statistical power = 0.8
## n = 778
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.802
## Type I error rate = 0.05
## Type II error rate = 0.2
Non-inferiority Test: The case when smaller
proportion is better. Expecting p - p0
smaller than
0.01.
pwrss.z.prop(p = 0.45, p0 = 0.50, margin = 0.01,
alpha = 0.05, power = 0.80,
alternative = "non-inferior")
## Approach: Normal Approximation
## A Proportion against a Constant (z Test)
## H0: p - p0 <= margin
## HA: p - p0 > margin
## ------------------------------
## Statistical power = 0.8
## n = 426
## ------------------------------
## Alternative = "non-inferior"
## Non-centrality parameter = -2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Non-inferiority Test: The case when bigger
proportion is better. Expecting p - p0
greater than
-0.01.
pwrss.z.prop(p = 0.55, p0 = 0.50, margin = -0.01,
alpha = 0.05, power = 0.80,
alternative = "non-inferior")
## Approach: Normal Approximation
## A Proportion against a Constant (z Test)
## H0: p - p0 <= margin
## HA: p - p0 > margin
## ------------------------------
## Statistical power = 0.8
## n = 426
## ------------------------------
## Alternative = "non-inferior"
## Non-centrality parameter = 2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Superiority Test: The case when smaller proportion
is better. Expecting p - p0
smaller than -0.01.
pwrss.z.prop(p = 0.45, p0 = 0.50, margin = -0.01,
alpha = 0.05, power = 0.80,
alternative = "superior")
## Approach: Normal Approximation
## A Proportion against a Constant (z Test)
## H0: p - p0 <= margin
## HA: p - p0 > margin
## ------------------------------
## Statistical power = 0.8
## n = 957
## ------------------------------
## Alternative = "superior"
## Non-centrality parameter = -2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Superiority Test: The case when bigger proportion is
better. Expecting p - p0
greater than 0.01.
pwrss.z.prop(p = 0.55, p0 = 0.50, margin = 0.01,
alpha = 0.05, power = 0.80,
alternative = "superior")
## Approach: Normal Approximation
## A Proportion against a Constant (z Test)
## H0: p - p0 <= margin
## HA: p - p0 > margin
## ------------------------------
## Statistical power = 0.8
## n = 957
## ------------------------------
## Alternative = "superior"
## Non-centrality parameter = 2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Equivalence Test: Expecting p - p0
between -0.01 and 0.01.
pwrss.z.prop(p = 0.50, p0 = 0.50, margin = 0.01,
alpha = 0.05, power = 0.80,
alternative = "equivalent")
## Approach: Normal Approximation
## A Proportion against a Constant (z Test)
## H0: |p - p0| >= margin
## HA: |p - p0| < margin
## ------------------------------
## Statistical power = 0.8
## n = 21410
## ------------------------------
## Alternative = "equivalent"
## Non-centrality parameter = -2.926 2.926
## Type I error rate = 0.05
## Type II error rate = 0.2
In the following examples p1
and p2
are
proportions for the first and second groups under alternative
hypothesis. The null hypothesis state p1 = p2
or
p1 - p2 = 0
.
One-sided Test: Expecting p1 - p2
smaller than 0 (zero).
# normal approximation
pwrss.z.2props(p1 = 0.45, p2 = 0.50,
alpha = 0.05, power = 0.80,
alternative = "less",
arcsin.trans = FALSE)
## Approach: Normal Approximation
## Difference between Two Proportions
## (Independent Samples z Test)
## H0: p1 = p2
## HA: p1 < p2
## ------------------------------
## Statistical power = 0.8
## n1 = 1231
## n2 = 1231
## ------------------------------
## Alternative = "less"
## Non-centrality parameter = -2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
# arcsine transformation
pwrss.z.2props(p1 = 0.45, p2 = 0.50,
alpha = 0.05, power = 0.80,
alternative = "less",
arcsin.trans = TRUE)
## Approach: Arcsine Transformation
## Cohen's h = -0.1
## Difference between Two Proportions
## (Independent Samples z Test)
## H0: p1 = p2
## HA: p1 < p2
## ------------------------------
## Statistical power = 0.8
## n1 = 1233
## n2 = 1233
## ------------------------------
## Alternative = "less"
## Non-centrality parameter = -2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Two-sided Test: Expecting p1 - p2
smaller than 0 (zero) or greater than 0 (zero).
pwrss.z.2props(p1 = 0.45, p2 = 0.50,
alpha = 0.05, power = 0.80,
alternative = "not equal")
## Approach: Normal Approximation
## Difference between Two Proportions
## (Independent Samples z Test)
## H0: p1 = p2
## HA: p1 != p2
## ------------------------------
## Statistical power = 0.8
## n1 = 1562
## n2 = 1562
## ------------------------------
## Alternative = "not equal"
## Non-centrality parameter = -2.802
## Type I error rate = 0.05
## Type II error rate = 0.2
Non-inferiority Test: The case when smaller
proportion is better. Expecting p1 - p2
smaller than
0.01.
pwrss.z.2props(p1 = 0.45, p2 = 0.50, margin = 0.01,
alpha = 0.05, power = 0.80,
alternative = "non-inferior")
## Approach: Normal Approximation
## Difference between Two Proportions
## (Independent Samples z Test)
## H0: p1 - p2 <= margin
## HA: p1 - p2 > margin
## ------------------------------
## Statistical power = 0.8
## n1 = 855
## n2 = 855
## ------------------------------
## Alternative = "non-inferior"
## Non-centrality parameter = -2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Non-inferiority Test: The case when bigger
proportion is better. Expecting p1 - p2
greater than
-0.01.
pwrss.z.2props(p1 = 0.55, p2 = 0.50, margin = -0.01,
alpha = 0.05, power = 0.80,
alternative = "non-inferior")
## Approach: Normal Approximation
## Difference between Two Proportions
## (Independent Samples z Test)
## H0: p1 - p2 <= margin
## HA: p1 - p2 > margin
## ------------------------------
## Statistical power = 0.8
## n1 = 855
## n2 = 855
## ------------------------------
## Alternative = "non-inferior"
## Non-centrality parameter = 2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Superiority Test: The case when smaller proportion
is better. Expecting p1 - p2
smaller than -0.01.
pwrss.z.2props(p1 = 0.45, p2 = 0.50, margin = -0.01,
alpha = 0.05, power = 0.80,
alternative = "superior")
## Approach: Normal Approximation
## Difference between Two Proportions
## (Independent Samples z Test)
## H0: p1 - p2 <= margin
## HA: p1 - p2 > margin
## ------------------------------
## Statistical power = 0.8
## n1 = 1923
## n2 = 1923
## ------------------------------
## Alternative = "superior"
## Non-centrality parameter = -2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Superiority Test: The case when bigger proportion is
better. Expecting p1 - p2
greater than 0.01.
pwrss.z.2props(p1 = 0.55, p2 = 0.50, margin = 0.01,
alpha = 0.05, power = 0.80,
alternative = "superior")
## Approach: Normal Approximation
## Difference between Two Proportions
## (Independent Samples z Test)
## H0: p1 - p2 <= margin
## HA: p1 - p2 > margin
## ------------------------------
## Statistical power = 0.8
## n1 = 1923
## n2 = 1923
## ------------------------------
## Alternative = "superior"
## Non-centrality parameter = 2.486
## Type I error rate = 0.05
## Type II error rate = 0.2
Equivalence Test: Expecting p1 - p2
between -0.01 and 0.01.
pwrss.z.2props(p1 = 0.50, p2 = 0.50, margin = 0.01,
alpha = 0.05, power = 0.80,
alternative = "equivalent")
## Approach: Normal Approximation
## Difference between Two Proportions
## (Independent Samples z Test)
## H0: |p1 - p2| >= margin
## HA: |p1 - p2| < margin
## ------------------------------
## Statistical power = 0.8
## n1 = 42820
## n2 = 42820
## ------------------------------
## Alternative = "equivalent"
## Non-centrality parameter = -2.926 2.926
## Type I error rate = 0.05
## Type II error rate = 0.2
Please send any bug reports, feedback or questions to bulusmetin [at] gmail.com.
–o–
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License .