Parametric tests are statistical hypothesis tests that assume the underlying data follow a specific distribution (typically normal). They are used to compare means, variances, or proportions across groups. The choice of test depends on sample size, number of groups, whether population parameters are known, and assumptions about equality of variances. The Z-test and t-test compare one mean against a benchmark or compare two means; the F-test compares variances; ANOVA extends the t-test to three or more groups. These tests are fundamental to business research for evaluating interventions, comparing segments, and testing relationships. Proper test selection supports valid conclusions and helps control Type I and Type II errors.
-
Z-Test
The Z-test is a parametric test used to determine whether a population mean differs from a known standard (one-sample) or whether two population means differ (two-sample). It applies when the population standard deviation (σ) is known and the sample is large (typically n ≥ 30), and it is based on the standard normal distribution. One-sample Z-test formula: z = (x̄ – μ) / (σ/√n), where x̄ = sample mean, μ = hypothesized population mean, σ = population standard deviation, n = sample size. Two-sample Z-test formula: z = (x̄₁ – x̄₂) / √(σ₁²/n₁ + σ₂²/n₂). Proportion Z-test: for one proportion, z = (p̂ – π) / √(π(1–π)/n), where π is the hypothesized proportion; for two proportions, z = (p̂₁ – p̂₂) / √(p̂(1–p̂)(1/n₁ + 1/n₂)), where p̂ = (x₁ + x₂)/(n₁ + n₂) is the pooled proportion of successes.
Assumptions: Data are independent; sample size large (Central Limit Theorem ensures normality of sampling distribution); population standard deviation known (rare in practice); random sampling.
Applications in business: Comparing sample mean to industry benchmark (known population parameters); A/B testing with very large samples (n > 100 per group); testing market share against target; quality control (comparing defect rate to standard). For example, a retailer knows from historical data that average customer spend is ₹1,000 (σ = ₹200). A sample of 100 customers after a promotion shows x̄ = ₹1,050. Then z = (1050 – 1000)/(200/√100) = 50/20 = 2.5, with two-tailed p ≈ 0.012, so the increase in average spend is statistically significant at α = 0.05.
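The retailer example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_sample_z` is an illustrative choice, and the p-value is computed as two-tailed.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-tailed p) for a one-sample Z-test with known sigma."""
    z = (sample_mean - pop_mean) / (pop_sd / sqrt(n))
    # Two-tailed p-value from the standard normal distribution
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Values taken from the retailer example in the text
z, p = one_sample_z(sample_mean=1050, pop_mean=1000, pop_sd=200, n=100)
print(f"z = {z:.2f}, two-tailed p = {p:.4f}")  # z = 2.50, two-tailed p = 0.0124
```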
Limitations: Requires known population variance (rarely available); less common than t-test in business research; for unknown σ, use t-test.
-
T-Test
The t-test is a parametric test used to compare means when the population standard deviation (σ) is unknown and estimated from the sample (s). It uses the t-distribution, which has heavier tails than the normal distribution, especially for small samples.
Three types: (1) One-sample t-test: Compares sample mean to a known or hypothesized population mean. Formula: t = (x̄ – μ) / (s/√n), df = n-1. (2) Independent (two-sample) t-test: Compares means of two independent groups. Formula: t = (x̄₁ – x̄₂) / (s_p × √(1/n₁ + 1/n₂)), where s_p = √[((n₁–1)s₁² + (n₂–1)s₂²) / (n₁ + n₂ – 2)] is the pooled standard deviation. df = n₁ + n₂ – 2. (3) Paired (dependent) t-test: Compares means of two related groups (same subjects measured twice, matched pairs). Formula: t = (d̄) / (s_d/√n), where d̄ = mean difference, s_d = standard deviation of the differences, n = number of pairs, df = n-1.
Assumptions: Normality (or n ≥ 30 per group for robustness); independence of observations; for independent t-test, homogeneity of variances (check with Levene’s test; if violated, use Welch’s t-test, which does not pool the variances); for paired t-test, differences should be normal.
Applications: Comparing customer satisfaction before/after service change (paired); comparing satisfaction between two stores (independent); testing whether employee engagement differs from industry norm (one-sample).
Effect size: Cohen’s d = (x̄₁ – x̄₂) / s_pooled (0.2 small, 0.5 medium, 0.8 large). Report t, df, p-value, and d.
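As an illustration of the independent t-test and Cohen’s d, the sketch below computes t, df, and d for two small groups. The satisfaction scores and the function name `pooled_t_test` are hypothetical, invented for this example. The p-value is not computed here: it comes from the t-distribution with the returned df, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_test(x, y):
    """Return (t, df, cohens_d) for an independent t-test with pooled variance."""
    n1, n2 = len(x), len(y)
    # statistics.variance is the sample variance (divides by n - 1)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    s_p = sqrt(pooled_var)
    diff = mean(x) - mean(y)
    t = diff / (s_p * sqrt(1 / n1 + 1 / n2))
    d = diff / s_p
    return t, n1 + n2 - 2, d

# Hypothetical satisfaction scores from two stores
store_a = [7.0, 8.0, 6.0, 9.0, 7.0, 8.0]
store_b = [5.0, 6.0, 7.0, 5.0, 6.0, 4.0]
t, df, d = pooled_t_test(store_a, store_b)
print(f"t = {t:.3f}, df = {df}, Cohen's d = {d:.3f}")
```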
-
F-Test
The F-test is a parametric test that compares variances (or variability) between two or more populations. It is based on the F-distribution, which is the distribution of the ratio of two independent chi-square variables, each divided by its degrees of freedom. The most common use is testing equality of variances (homogeneity of variance) before conducting t-tests or ANOVA.
Formula: F = s₁² / s₂², where s₁² is the larger variance (numerator) and s₂² is the smaller variance (denominator). F ≥ 1 always. Degrees of freedom: df₁ = n₁ – 1, df₂ = n₂ – 1. A significant F (p < 0.05) indicates variances are unequal, violating an assumption of t-test and ANOVA.
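The variance-ratio calculation can be sketched as follows. The product weights and the function name `variance_ratio_f` are hypothetical. The sketch returns only F and its degrees of freedom; the p-value would come from the F-distribution with those degrees of freedom.

```python
from statistics import variance

def variance_ratio_f(x, y):
    """Return (F, df_numerator, df_denominator), larger variance on top."""
    var_x, var_y = variance(x), variance(y)  # sample variances (n - 1)
    if var_x >= var_y:
        return var_x / var_y, len(x) - 1, len(y) - 1
    return var_y / var_x, len(y) - 1, len(x) - 1

# Hypothetical product weights from two production lines
line_a = [10.0, 12.0, 14.0, 16.0, 18.0]
line_b = [13.0, 14.0, 14.0, 15.0, 14.0]
f, df1, df2 = variance_ratio_f(line_a, line_b)
print(f"F = {f:.2f}, df = ({df1}, {df2})")
```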
Other uses: (1) Overall F-test in regression: Tests whether all regression coefficients (except intercept) are simultaneously zero. F = (MSR) / (MSE), where MSR = regression mean square, MSE = error mean square. Significant F means at least one predictor explains variance. (2) F-test for nested models: Compares a reduced model (fewer predictors) to a full model. (3) Two-sample variance comparison: e.g., testing whether the variance of product weights is equal across two production lines.
Assumptions: Normality of populations; independent random samples.
Applications in business: Quality control (comparing variability across suppliers, shifts, or machines); regression model significance testing; checking assumptions before ANOVA. For example, testing if variance in delivery times differs between two warehouses (F = 1.8, p = 0.03 → variances differ). Note: F-test for variances is sensitive to non-normality; Levene’s test is a more robust alternative.
-
ANOVA (Analysis of Variance)
ANOVA (Analysis of Variance) is a parametric test that compares means across three or more independent groups simultaneously. It extends the t-test (which handles only two groups) while controlling Type I error that would accumulate from multiple pairwise t-tests. One-way ANOVA: One independent variable (factor) with three or more levels (categories).
Formula: F = MS_between / MS_within, where MS_between = SS_between/(k – 1) is the variance explained by group differences and MS_within = SS_within/(N – k) is the error (within-group) variance, with k groups and N total observations. If F is significant (p < α), at least one group mean differs from the others.
Follow-up tests: Post-hoc comparisons (Tukey HSD, Bonferroni) identify which specific groups differ.
Assumptions: Independence of observations; normality within each group (or n ≥ 30 per group); homogeneity of variances (Levene’s test; if violated, use Welch’s ANOVA or Kruskal-Wallis).
Types: (1) One-way ANOVA (one factor). (2) Two-way ANOVA (two factors, tests main effects and interaction). (3) Repeated measures ANOVA (same subjects measured under multiple conditions). (4) MANOVA (multiple dependent variables).
Applications in business: Comparing customer satisfaction across three store locations; testing sales effectiveness of four advertising campaigns; evaluating employee engagement across five departments; analyzing product preference across age groups (e.g., 18–30, 31–45, 46–60).
Effect size: η² (eta-squared) = SS_between / SS_total (0.01 small, 0.06 medium, 0.14 large). Report F, df, p-value, and η². Non-significant ANOVA (p > α) means no evidence of group mean differences.
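A one-way ANOVA F statistic and η² can be computed directly from the sums of squares. The sketch below uses made-up data for three groups; the function name `one_way_anova` is an illustrative choice, and the p-value (from the F-distribution) is left out.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, (df_between, df_within), eta_squared) for one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    f = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    eta_squared = ss_between / (ss_between + ss_within)
    return f, (k - 1, n_total - k), eta_squared

# Hypothetical satisfaction scores at three store locations
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f, dfs, eta2 = one_way_anova(groups)
print(f"F = {f:.2f}, df = {dfs}, eta squared = {eta2:.2f}")
```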
Non-Parametric Tests
Non-parametric tests (distribution-free tests) do not assume normality or any other specific population distribution. They are used when parametric test assumptions are violated (non-normal data, small samples, ordinal scales). They work with ranks or frequencies rather than raw values. When parametric assumptions do hold, non-parametric tests are generally less powerful (they need larger samples to detect the same effect), but they are more robust and apply to a wider range of data types, including nominal and ordinal measurements. Common non-parametric tests include chi-square (frequencies), sign test (median differences), Mann-Whitney U (two independent groups), Kruskal-Wallis (three or more groups), and Wilcoxon signed-rank (paired/repeated measures).
- Chi-Square Test (χ²)
The chi-square test (χ²) is a non-parametric test for analyzing categorical (nominal or ordinal) data. It compares observed frequencies to expected frequencies under the null hypothesis.
Two common types:
(1) Chi-square goodness-of-fit test: Determines whether a single categorical variable matches an expected distribution (e.g., market share 40%, 35%, 25%). Formula: χ² = Σ[(O – E)²/E], df = k-1, where O = observed count, E = expected count, k = number of categories.
(2) Chi-square test of independence: Tests whether two categorical variables are associated (e.g., gender and brand preference). The formula is the same, with the expected count for each cell computed as E = (row total × column total) / N; df = (r-1)(c-1) where r = rows, c = columns.
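The test of independence can be sketched in a few lines, with each expected count computed as (row total × column total) / N. The contingency table and the function name `chi_square_independence` are hypothetical; the p-value would come from the chi-square distribution with the returned df.

```python
def chi_square_independence(table):
    """Return (chi2, df) for an r x c contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = satisfied / unsatisfied, columns = repeat purchase yes / no
observed = [[30, 20], [20, 30]]
chi2, df = chi_square_independence(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
```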
Assumptions: Random sampling; expected frequencies ≥ 5 per cell (if violated, use Fisher’s exact test); independent observations.
Applications: Market share analysis; customer segmentation; preference differences across demographic groups; testing association between satisfaction (satisfied/unsatisfied) and repeat purchase (yes/no).
Effect size: Cramér’s V = √(χ² / (N × min(r-1, c-1))), where N = total sample size (0.1 small, 0.3 medium, 0.5 large).
- Sign Test
The sign test is a simple non-parametric test for paired or repeated measures data. It tests whether the median difference between two related conditions is zero, using only the direction (sign) of differences, not magnitude.
Procedure: For each pair, record whether the difference is positive (+), negative (-), or zero (discard zeros). Count n = total non-zero pairs. Under H₀ (no difference), the number of positive signs follows a binomial distribution with p = 0.5. Compare observed positives to binomial critical value or compute exact p-value. For large n (≥20), use normal approximation with continuity correction.
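The exact version of this procedure can be sketched with the binomial distribution. The differences and the function name `sign_test` are hypothetical. The two-sided p-value doubles the upper-tail probability of the more frequent sign and caps it at 1.

```python
from math import comb

def sign_test(diffs):
    """Return (positives, n_nonzero, two-sided exact p) for the sign test."""
    positives = sum(d > 0 for d in diffs)
    negatives = sum(d < 0 for d in diffs)
    n = positives + negatives          # zero differences are discarded
    k = max(positives, negatives)
    # P(X >= k) for X ~ Binomial(n, 0.5)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return positives, n, min(1.0, 2 * upper_tail)

# Hypothetical after-minus-before differences in satisfaction ratings
diffs = [2, 1, 3, -1, 2, 0, 1, 2, -1, 1, 3, 2]
positives, n, p = sign_test(diffs)
print(f"{positives} positive signs out of {n}, two-sided p = {p:.4f}")
```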
Assumptions: Pairs are independent; differences need not be normal.
Applications: Before/after studies without normality (e.g., customer satisfaction pre/post intervention measured on ordinal scale); comparing two products (preference direction only); taste tests.
Limitations: Ignores magnitude of change, reducing power. For paired data with normal differences, paired t-test is more powerful. Effect size not standard; report proportion of positive signs.
- Mann-Whitney U-Test
The Mann-Whitney U test (also called Wilcoxon rank-sum test) compares the distributions of two independent groups when the dependent variable is ordinal or continuous but non-normal. It tests whether one group tends to have larger values than the other (stochastic dominance).
Procedure: Combine all observations from both groups, rank them from smallest to largest (ties receive average ranks). Sum ranks for each group: R₁ and R₂. Calculate U₁ = n₁n₂ + [n₁(n₁+1)/2] – R₁; U₂ = n₁n₂ – U₁. U = min(U₁, U₂). For large samples (n₁, n₂ > 20), use the normal approximation z = (U – n₁n₂/2) / √(n₁n₂(n₁+n₂+1)/12).
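The ranking and U calculation above can be sketched as follows. The scores and the helper names `average_ranks` and `mann_whitney_u` are hypothetical. Tied values receive the average of the ranks they occupy.

```python
def average_ranks(values):
    """Rank values from smallest to largest; ties get the average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def mann_whitney_u(x, y):
    """Return (U, U1, U2) using the rank-sum formulas from the text."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    r1 = sum(ranks[:n1])               # rank sum of the first group
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return min(u1, u2), u1, u2

# Hypothetical Likert-type satisfaction scores from two stores
store_a = [3, 4, 2, 5, 4]
store_b = [5, 6, 4, 7, 6]
u, u1, u2 = mann_whitney_u(store_a, store_b)
print(f"U = {u}, U1 = {u1}, U2 = {u2}")
```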
Assumptions: Independent random samples; ordinal or continuous data; distributions have same shape (for interpreting as median difference).
Applications: Comparing customer satisfaction scores (ordinal Likert) between two stores; testing salary differences between genders (non-normal data); comparing time spent on website across two user groups.
Effect size: r = Z/√N (0.1 small, 0.3 medium, 0.5 large) or rank-biserial correlation.
- Kruskal-Wallis Test
The Kruskal-Wallis test is the non-parametric equivalent of one-way ANOVA for comparing three or more independent groups. It tests whether the samples come from the same distribution; when the group distributions have a similar shape, this amounts to a comparison of medians.
Procedure: Combine all observations from all groups, rank them (lowest to highest). Compute sum of ranks for each group (R_j). Calculate H statistic: H = [12/(N(N+1))] × Σ(R_j²/n_j) – 3(N+1), where N = total sample size, n_j = size of group j. For large samples and no ties, H follows chi-square distribution with df = k-1 (k = number of groups). For ties, use correction factor.
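A minimal sketch of the H calculation follows, using made-up ratings for three groups. The helper names are hypothetical, and the tie correction factor is deliberately not applied, so the result matches the formula above only when there are no ties.

```python
def average_ranks(values):
    """Rank values from smallest to largest; ties get the average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def kruskal_wallis_h(groups):
    """Return (H, df) without tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted_sum, start = 0.0, 0
    for g in groups:
        r_j = sum(ranks[start:start + len(g)])   # rank sum of group j
        weighted_sum += r_j ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * weighted_sum - 3 * (n_total + 1)
    return h, len(groups) - 1

# Hypothetical ratings from three store locations (no ties)
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h, df = kruskal_wallis_h(groups)
print(f"H = {h:.2f}, df = {df}")
```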
Assumptions: Independent random samples; ordinal or continuous data; distributions have similar shape (for median interpretation).
Post-hoc tests: Dunn’s test with Bonferroni correction for pairwise comparisons after significant H.
Applications: Comparing customer satisfaction across multiple store locations; testing employee engagement across departments (ordinal data); comparing product preference ratings across four age groups. Effect size: η²_H = (H – k + 1)/(N – k). Report H, df, p-value.
- Wilcoxon Signed-Rank Test
The Wilcoxon signed-rank test is the non-parametric equivalent of the paired t-test for two related (paired) samples or repeated measures. It considers both direction and magnitude of differences, making it more powerful than the sign test.
Procedure: For each pair, calculate difference (d). Discard zero differences. Rank the absolute differences (|d|) from smallest to largest (ties receive average ranks). Assign signs (+ or -) back to ranks based on original difference direction. Sum positive ranks (W⁺) and negative ranks (W⁻). Test statistic W = min(W⁺, W⁻) or W = W⁺ (depending on software). For n > 20 non-zero pairs, use the normal approximation z = (W⁺ – n(n+1)/4) / √(n(n+1)(2n+1)/24).
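The signed-rank sums can be sketched as below. The before/after scores and the helper names are hypothetical, and the sketch stops at W⁺ and W⁻ without computing a p-value.

```python
def average_ranks(values):
    """Rank values from smallest to largest; ties get the average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus, n_nonzero) for paired samples."""
    # Differences are after minus before; zero differences are discarded
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)

# Hypothetical satisfaction scores for the same eight customers
before = [10, 12, 9, 11, 14, 8, 13, 10]
after = [12, 15, 8, 15, 14, 13, 19, 17]
w_plus, w_minus, n = wilcoxon_signed_rank(before, after)
print(f"W+ = {w_plus}, W- = {w_minus}, n = {n}, W = {min(w_plus, w_minus)}")
```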
Assumptions: Pairs are independent; differences are symmetric about median (for paired data); ordinal or continuous data (not necessarily normal).
Applications: Before/after studies with non-normal data (e.g., customer satisfaction pre/post intervention measured on Likert scale); comparing two product ratings from same respondents; testing weight loss (pre/post) with small sample. Effect size: r = Z/√N. Report W, z (if n > 20), p-value, and median difference.

