Key terms and concepts for research design and statistical analysis
Alpha (α): The significance level, or the probability of making a Type I error (rejecting a true null hypothesis). Commonly set at 0.05, meaning a 5% chance of a false positive.
Example: Setting α = 0.05 means you accept a 5% risk of concluding there is an effect when there actually is none.
Beta (β): The probability of making a Type II error (failing to reject a false null hypothesis). Power is calculated as 1 - β.
Example: If β = 0.20, there is a 20% chance of missing a true effect (and 80% power to detect it).
Statistical power: The probability that a statistical test will detect an effect when one truly exists. Conventionally set at 0.80 (80%) or 0.90 (90%).
Example: A study with 80% power has an 80% chance of detecting a true effect of the expected size.
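The relationship between effect size, sample size, and power can be sketched with a simple normal approximation. This is a minimal illustration for a two-sided one-sample z-test with known variance (the function name, data, and the decision to drop the negligible far-tail term are all choices made here, not part of the text above):

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test for a
    standardized effect size d and sample size n (normal approximation;
    the negligible far-tail rejection probability is ignored)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # ≈ 1.96 for alpha = 0.05
    return NormalDist().cdf(d * sqrt(n) - z_crit)

# A medium effect (d = 0.5) with n = 32 gives roughly 80% power.
power = approx_power(0.5, 32)
```

Doubling the sample size raises power toward 1, which is why power analyses are run before data collection.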
P-value: The probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true. A smaller p-value provides stronger evidence against the null hypothesis.
Example: A p-value of 0.03 means there is a 3% probability of observing these results (or more extreme) if the null hypothesis were true.
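For a test statistic with a known null distribution, the p-value is just a tail area. A minimal sketch for a z statistic, using only the Python standard library (the z value 2.17 is an invented example):

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for an observed z statistic:
    the probability of a result at least this extreme in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = two_tailed_p(2.17)   # ≈ 0.03, matching the example above
```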
Effect size: A quantitative measure of the magnitude of a phenomenon or the strength of a relationship. Unlike p-values, effect sizes are independent of sample size and indicate practical significance.
Example: Cohen's d = 0.5 indicates the treatment group scored half a standard deviation higher than the control group.
Cohen's d: A standardized measure of the difference between two means, expressed in standard deviation units. Values of 0.2, 0.5, and 0.8 are conventionally considered small, medium, and large effects.
Example: d = 0.65 indicates a medium-to-large effect, meaning the groups differ by 0.65 standard deviations.
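Cohen's d is the mean difference divided by the pooled standard deviation. A minimal sketch with invented scores (the data and function name are illustrative only):

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Cohen's d using the pooled sample standard deviation."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / pooled

treatment = [86, 90, 84, 88, 92]
control   = [80, 84, 78, 82, 86]
d = cohens_d(treatment, control)   # ≈ 1.9, a large effect for these toy scores
```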
Confidence interval (CI): A range of values that likely contains the true population parameter with a specified level of confidence (typically 95%). Provides information about precision and uncertainty.
Example: A 95% CI of [2.3, 4.7] means we are 95% confident the true population mean falls between 2.3 and 4.7.
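A 95% CI for a mean is the sample mean plus or minus a critical value times the standard error. The sketch below uses the normal critical value for simplicity; for small samples a t critical value gives a slightly wider, more exact interval (the sample data are invented):

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def ci95(sample):
    """Approximate 95% CI for the mean (normal critical value;
    a t critical value is more exact for small samples)."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))
    z = NormalDist().inv_cdf(0.975)   # ≈ 1.96
    return (m - z * se, m + z * se)

lo, hi = ci95([3.1, 3.6, 2.8, 3.9, 3.4, 3.2, 3.7, 3.0])
```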
Null hypothesis (H₀): The default assumption that there is no effect or no difference between groups. Statistical tests aim to provide evidence against the null hypothesis.
Example: H₀: The new treatment has no effect on symptoms (mean difference = 0).
Alternative hypothesis (H₁): The research hypothesis that contradicts the null hypothesis, stating there is an effect or difference.
Example: H₁: The new treatment's effect on symptoms differs from placebo (mean difference ≠ 0). (A directional claim such as "reduces symptoms" would instead be a one-tailed alternative, mean difference > 0.)
Type I error: A false positive: rejecting the null hypothesis when it is actually true. The probability of a Type I error is α (alpha).
Example: Concluding a drug is effective when it actually has no effect.
Type II error: A false negative: failing to reject the null hypothesis when it is actually false. The probability of a Type II error is β (beta).
Example: Concluding a drug is ineffective when it actually works.
Standard deviation (SD): A measure of variability indicating how spread out data points are from the mean. A larger SD indicates more variability.
Example: Test scores with SD = 15 are more spread out than scores with SD = 5.
Standard error (SE): The standard deviation of the sampling distribution of a statistic, most commonly the mean. It estimates how much sample means vary from the true population mean.
Example: A smaller SE indicates the sample mean is a more precise estimate of the population mean.
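The standard error of the mean is the sample SD divided by the square root of the sample size, which is why larger samples give more precise estimates. A minimal sketch with invented scores:

```python
from math import sqrt
from statistics import stdev

scores = [12, 15, 11, 14, 13, 16, 12, 15]
sd = stdev(scores)              # sample standard deviation (n - 1 denominator)
se = sd / sqrt(len(scores))     # standard error of the mean: SE = SD / sqrt(n)
```

Quadrupling n halves the SE, since precision improves with the square root of sample size.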
Correlation coefficient (r): A measure of the strength and direction of the linear relationship between two variables. Ranges from -1 (perfect negative) to +1 (perfect positive).
Example: r = 0.70 indicates a strong positive correlation between study time and test scores.
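Pearson's r is the covariance of the two variables scaled by the product of their spreads. A self-contained sketch (the study-time data are invented to echo the example above):

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient for paired observations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x)
                      * sum((b - my) ** 2 for b in y))

study_hours = [1, 2, 3, 4, 5]
test_scores = [55, 60, 70, 72, 85]
r = pearson_r(study_hours, test_scores)   # strongly positive for these data
```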
R-squared (R²): The proportion of variance in the dependent variable explained by the independent variable(s). Ranges from 0 to 1.
Example: R² = 0.36 means 36% of the variance in test scores is explained by study time.
Eta-squared (η²): An effect size measure for ANOVA indicating the proportion of total variance explained by group differences. Values of 0.01, 0.06, and 0.14 are considered small, medium, and large.
Example: η² = 0.08 means group membership explains 8% of the variance in the outcome.
Chi-square (χ²): A test statistic for categorical data, measuring how much observed frequencies differ from expected frequencies. Used in goodness-of-fit and independence tests.
Example: χ² = 8.5 with p = 0.014 suggests observed category frequencies differ significantly from expected.
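The chi-square statistic sums the squared deviations of observed from expected counts, each scaled by the expected count. A minimal goodness-of-fit sketch (the die-roll counts are invented):

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Goodness-of-fit: 60 rolls of a die, expected uniform (10 per face).
observed = [8, 12, 9, 11, 15, 5]
expected = [10] * 6
stat = chi_square(observed, expected)   # = 6.0, with df = 6 - 1 = 5
```

The statistic is then compared against a χ² critical value for the appropriate degrees of freedom.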
Cramér's V: An effect size measure for chi-square tests, ranging from 0 (no association) to 1 (perfect association).
Example: V = 0.25 indicates a weak-to-moderate association between two categorical variables.
Degrees of freedom (df): The number of independent values that can vary in a statistical calculation. Used to determine critical values for test statistics.
Example: For a t-test comparing two groups of 30 each, df = 58 (n₁ + n₂ - 2).
ANOVA (analysis of variance): A statistical test comparing means across three or more groups. Tests whether group means differ more than expected by chance.
Example: One-way ANOVA with F(2,87) = 5.32, p = 0.007 indicates significant differences among three teaching methods.
t-test: A statistical test comparing means between two groups or comparing a sample mean to a known value.
Example: t(48) = 2.31, p = 0.025 suggests the treatment group scored significantly higher than controls.
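An independent-samples t statistic divides the mean difference by its standard error; the pooled-variance version below assumes roughly equal group variances (the group data are invented for illustration):

```python
from math import sqrt
from statistics import mean, stdev

def pooled_t(group1, group2):
    """Independent-samples t statistic with pooled variance;
    df = n1 + n2 - 2 (assumes roughly equal group variances)."""
    n1, n2 = len(group1), len(group2)
    sp2 = ((n1 - 1) * stdev(group1) ** 2
           + (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se

t_stat = pooled_t([23, 25, 28, 30, 24], [20, 22, 19, 25, 21])
df = 5 + 5 - 2   # df = 8, so compare t_stat to the t distribution with 8 df
```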
Linear regression: A method for modeling the relationship between a dependent variable (Y) and one or more independent variables (X). The slope (b) indicates how much Y changes for each unit change in X.
Example: Y = 50 + 3X means for each 1-unit increase in X, Y increases by 3 points, starting from a baseline of 50.
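The least-squares slope and intercept can be computed directly from means and deviations. The sketch below recovers the line from the example above from data that lie exactly on it (the data are constructed for illustration):

```python
from statistics import mean

def least_squares(x, y):
    """Simple linear regression: intercept a and slope b for y = a + b*x,
    fit by ordinary least squares."""
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b

x = [0, 1, 2, 3, 4]
y = [50, 53, 56, 59, 62]       # points on the line Y = 50 + 3X
a, b = least_squares(x, y)     # a = 50.0, b = 3.0
```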
Two-tailed test: A statistical test that considers both directions of difference (greater than or less than). Used when the direction of effect is not predicted.
Example: Testing whether a drug affects blood pressure (either increasing or decreasing it).
One-tailed test: A statistical test that considers only one direction of difference (greater than OR less than, but not both). Used when the direction of effect is predicted.
Example: Testing whether a new teaching method improves (not just changes) test scores.
Sample size (n): The number of observations or participants in a study. Larger samples provide more precise estimates and greater power to detect effects.
Example: A study with n = 200 has more power than one with n = 50, assuming the same effect size.
Margin of error: The amount of random sampling error in a survey result, typically expressed as ± percentage points. Larger samples have smaller margins of error.
Example: ±3% margin of error means the true population value is likely within 3 percentage points of the sample estimate.
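For a sample proportion, the margin of error follows from the normal approximation to the binomial; the worst case is p = 0.5. A minimal sketch (the function name and sample size are illustrative choices):

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p, n, confidence=0.95):
    """Margin of error for a sample proportion p with sample size n
    (normal approximation to the binomial)."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)   # ≈ 1.96 for 95%
    return z * sqrt(p * (1 - p) / n)

# Worst case p = 0.5 with n ≈ 1067 gives the familiar ±3% survey margin.
moe = margin_of_error(0.5, 1067)
```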
Statistical significance: When a result is unlikely to have occurred by chance alone (typically p < 0.05). Does not necessarily imply practical importance.
Example: A statistically significant difference (p = 0.001) might be too small to matter in practice.
Practical significance: Whether an effect is large enough to be meaningful in real-world contexts. Determined by effect size, not p-values.
Example: A 1-point improvement on a 100-point scale might be statistically significant but not practically meaningful.