Simple Statistics Dictionary

ANCOVA (Analysis of Covariance)

ANCOVA (analysis of covariance) tests if 2+ population means are equal while controlling for 1+ background variables.

Example: do medicines A, B and C result in equal mean blood pressures when controlling for age?

ANCOVA basically combines ANOVA and regression. This tutorial walks you through the analysis with an example in SPSS.

ANOVA (Analysis of Variance)

ANOVA (analysis of variance) tests if 3+ population means are all equal.

Example: do the pupils of schools A, B and C have equal mean IQ scores?

This super simple introduction quickly walks you through the basics such as assumptions, null hypothesis and post hoc tests.

ANOVA (Repeated Measures)

Repeated measures ANOVA tests if 3+ variables have equal means in some population.

Example: are the mean scores on IQ tests A, B and C equal for all Dutch children?

This simple introduction quickly walks you through the basics.

Binomial Test

A binomial test examines if a population percentage is equal to x.

Example: is 45% of all Amsterdam citizens currently single? Or is it a different percentage?

This simple tutorial quickly walks you through the basics.
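The idea can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the sample counts (40 singles among 100 respondents) are made up, the function names are hypothetical, and the two-sided p-value uses one common convention (summing all outcomes that are no more likely than the observed one).

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_test_two_sided(k, n, p0):
    """Exact two-sided p-value: sum the probabilities of all outcomes
    that are no more likely than the observed one under H0."""
    observed = binomial_pmf(k, n, p0)
    # The small relative tolerance guards against floating-point ties.
    total = sum(
        prob
        for prob in (binomial_pmf(i, n, p0) for i in range(n + 1))
        if prob <= observed * (1 + 1e-7)
    )
    return min(1.0, total)

# Hypothetical sample: 40 singles among 100 respondents, H0: 45% are single.
p_value = binomial_test_two_sided(k=40, n=100, p0=0.45)
print(f"two-sided p = {p_value:.3f}")
```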

Boxplot

A boxplot is a chart showing

quartiles;
potential outliers;
extreme values;
and some other statistics.

This tutorial quickly walks you through with some examples.

Chi-Square Goodness-of-Fit Test

A chi-square goodness-of-fit test examines if a categorical variable has some hypothesized frequency distribution in some population.

Chi-Square Independence Test

A chi-square independence test evaluates if two categorical variables are related in some population.

This simple introduction explains how the test basically works and how to run and interpret it.

Cochran's Q Test

SPSS Cochran's Q test is a procedure for testing whether the proportions of 3 or more dichotomous variables are equal. These outcome variables have been measured on the same people or other statistical units.

Cohen’s D

Cohen’s D is the effect size measure of choice for t-tests.

This simple tutorial quickly walks you through

rules of thumb for small, medium and large effects;
formulas for computing Cohen’s D;
and software options for obtaining it.
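As a rough sketch, one widely used formula for independent samples divides the difference in means by the pooled standard deviation. The function name and the tiny data set below are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's D for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

# Made-up scores: means 4 and 3, both sample variances 4, so D = 1 / 2.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)
```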

Cohen’s Kappa

Cohen’s kappa tells to what extent 2 ratings agree better than chance level.

Simple tutorial with calculation examples, formulas & effect size rules.
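A small calculation sketch, assuming the usual definition kappa = (observed agreement - chance agreement) / (1 - chance agreement). The function name and the ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(ratings_a)
    # Proportion of cases on which both raters give the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    return (observed - expected) / (1 - expected)

# Made-up ratings: 75% observed agreement, 50% expected by chance.
kappa = cohens_kappa(["y", "y", "n", "n"], ["y", "n", "n", "n"])
print(kappa)
```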

Confidence Interval

A confidence interval (CI) is a range of values that encloses a parameter with a given likelihood. Example: the 95% CI runs from 586 through 612 grams.
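A minimal sketch of how such an interval for a mean could be computed. It assumes a normal approximation (small samples usually call for a t-distribution instead), and the weights below are made up rather than the data behind the example above.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, confidence=0.95):
    """Approximate CI for a mean: mean +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(values) / sqrt(len(values))
    center = mean(values)
    return center - margin, center + margin

# Hypothetical weights in grams.
weights = [590, 605, 598, 611, 587, 602, 596, 608, 593, 600]
lower, upper = mean_confidence_interval(weights)
print(f"95% CI runs from {lower:.1f} through {upper:.1f} grams")
```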

Covariance

A covariance is basically an unstandardized correlation.

This tutorial covers

when and why to use covariances instead of correlations;
basic formulas for covariances;
obtaining covariances from Google Sheets and SPSS;
and way more...
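The "unstandardized correlation" idea can be sketched as follows: dividing a covariance by both standard deviations yields the correlation, so changing a variable's units changes the covariance but not the correlation. Function names and data are illustrative assumptions.

```python
from math import sqrt

def covariance(x, y):
    """Sample covariance (denominator n - 1)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Covariance standardized by both standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
y_rescaled = [value * 100 for value in y]  # same variable in other units

print(covariance(x, y), covariance(x, y_rescaled))    # covariance depends on units
print(correlation(x, y), correlation(x, y_rescaled))  # correlation does not
```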

Cramér’s V

Cramér’s V is a number between 0 and 1 that indicates how strongly two nominal variables are correlated.

Because it's suitable for categorical variables, Cramér’s V is often used as an effect size measure for a chi-square independence test.
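A short sketch, assuming the usual formula V = sqrt(chi-square / (n * (min(rows, columns) - 1))) and a contingency table without empty rows or columns. The function name and the tables are hypothetical.

```python
from math import sqrt

def cramers_v(table):
    """Cramér's V from a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Chi-square statistic: squared deviations from the expected frequencies.
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    smallest_dimension = min(len(row_totals), len(col_totals))
    return sqrt(chi_square / (n * (smallest_dimension - 1)))

print(cramers_v([[10, 0], [0, 10]]))  # perfectly related variables
print(cramers_v([[5, 5], [5, 5]]))    # unrelated variables
```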

Dichotomous Variable

Dichotomous variables are variables that hold precisely two distinct values.

Example: sex can only be male or female.

Some analyses that are only suitable for dichotomous variables are

Effect Size

Effect size is an interpretable number that quantifies the difference between data and some hypothesis.

Effect size measures are useful for comparing effects across and within studies. This tutorial helps you to choose, obtain and interpret an effect size for each major statistical procedure.

Eta Squared

(Partial) eta squared is an effect size measure for one-way or factorial ANOVA. This tutorial shows 2 easy ways to get it from SPSS.

Frequency Distribution

A frequency distribution is an overview of all values in some variable and how often these occur.

Like so, a frequency distribution shows how frequencies are distributed over values. This tutorial quickly makes things clear with some simple examples.

Histogram

A histogram is a chart showing frequencies for fixed width intervals of a metric variable. This tutorial explains what histograms are and demonstrates why they are useful with illustrations and examples.

Inferential Statistics

Inferential statistics is the branch of statistics that tries to draw conclusions (inferences) about populations based on (much smaller) samples.

Kendall’s Concordance Coefficient W

Kendall’s Concordance Coefficient W is a number between 0 and 1 that indicates interrater agreement.

This tutorial explains the basic idea behind Kendall’s W and shows how to get it from SPSS.

Kendall’s Tau

Kendall’s Tau is a number between -1 and +1 that indicates to what extent 2 variables are monotonically related.

This tutorial quickly walks you through some basics such as assumptions, significance and confidence intervals for Kendall’s Tau.

Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov test examines if a variable is normally distributed in some population.

This “normality assumption” is required for t-tests, ANOVA and many other tests. This tutorial shows how to run and interpret a Kolmogorov-Smirnov test in SPSS with some simple examples.

Kurtosis

In statistics, kurtosis refers to the “peakedness” of a distribution.

This quick introduction walks you through with illustrations, formulas and a calculation example.

Levene’s Test

Levene’s test examines if 2+ populations have equal variances on some variable.

This condition, known as the homogeneity of variance assumption, is required by t-tests and ANOVA.

So how to run and interpret this test in SPSS? This simple tutorial quickly walks you through.

Logistic Regression

Logistic regression is a technique for predicting a dichotomous outcome variable from 1+ predictors.

This simple introduction quickly walks you through all logistic regression basics with a downloadable example analysis.

Mahalanobis Distance

Mahalanobis distances are used for detecting multivariate outliers.

In SPSS, they're found under Analyze - Regression - Linear - Save.

Mann-Whitney Test

The Mann-Whitney test is an alternative for the independent samples t-test when the assumptions required by the latter aren't met by the data. The most common scenario is testing a non-normally distributed outcome variable in a small sample (say, n < 25).

McNemar Test

SPSS McNemar test is a procedure for testing whether the proportions of two dichotomous variables are equal. The two variables have been measured on the same cases.

Measurement Levels

Measurement levels are types of variables that tell you how they should be analyzed. There are 4 types:

nominal variables;
ordinal variables;
interval variables;
ratio variables.

This tutorial quickly walks you through with a simple flowchart and some examples.

Median

The median is basically the value that separates the 50% lowest from the 50% highest values.

Example: a median income of $2,500 means that 50% of all people earn less and 50% earn more than that amount.

Median Test for 2 Independent Medians

SPSS median test evaluates if two groups of respondents have equal population medians on some variable. This easy tutorial quickly walks you through.

Mode

In statistics, the mode for a variable is its value or range of values with the highest frequency. Find the mode in SPSS & Excel with simple examples.

Normal Distribution

The normal distribution is a bell-shaped probability density function.

This tutorial quickly covers all you need to know such as

looking up probabilities from a normal distribution;
basic formulas and software;
statistical tests for “normality”.

Null Hypothesis

A null hypothesis is an exact statement about a population that we try to reject with sample data.

Example: 20% of some population carry virus X. If a sample from this population shows a very different percentage, then we reject this null hypothesis.

Pearson Correlation

A Pearson correlation is a number between -1 and +1 that indicates how strongly two variables are linearly related.

This simple tutorial quickly explains the basics with outstanding illustrations and examples.

Percentile

The nth percentile is the value that separates the lowest n% from all other values.

Example: the 10th percentile for body weight is 60 kilos. This means that 10% of all people weigh less than 60 kilos and 90% of people weigh more.

Simple tutorial with examples in Excel & SPSS and (interpolation) formulas.
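Several interpolation conventions exist, and software packages differ in which one they use. The sketch below shows just one of them (linear interpolation between the two closest ranks); the function name and data are illustrative assumptions.

```python
def percentile(values, n):
    """nth percentile using linear interpolation between neighboring values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * n / 100  # fractional index into the sorted data
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

weights = [40, 30, 10, 20]  # made-up values
print(percentile(weights, 25))  # falls between 10 and 20
print(percentile(weights, 50))  # the median
```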

Power (Statistics)

In statistics, power is the probability of rejecting a false null hypothesis.

This tutorial gently walks you through everything you need to know:

how does it work and why does it matter?
which factors affect power and how?
which software is best for computing power?
and way more...

Probability Density Function

A probability density function is a function from which probabilities for ranges of outcomes can be obtained.

Example: the probability is 95% that your IQ is between 70 and 130 points. This statement is based on the normal distribution, probably the best known probability density function.

So how does that work? And how do density functions differ from probability distribution functions? This tutorial quickly clears things up.
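The IQ example can be reproduced with a few lines of Python. Note the assumption that is not stated above: IQ scores are taken to be normally distributed with mean 100 and standard deviation 15.

```python
from statistics import NormalDist

# Assumed IQ distribution: normal with mean 100 and standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# The probability for a range is the area under the density curve,
# obtained as a difference between two cumulative probabilities.
probability = iq.cdf(130) - iq.cdf(70)
print(round(probability, 4))  # 0.9545, so roughly 95%
```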

Sampling Distribution

A sampling distribution is the frequency distribution of a sample statistic (mean, standard deviation or other) over repeated samples. A sampling distribution tells us which outcomes we should expect, given our research hypothesis.

Shapiro-Wilk Test

The Shapiro-Wilk test examines if a variable is normally distributed in a population. This assumption is required by some statistical tests such as t-tests and ANOVA.

The SW-test is an alternative for the Kolmogorov-Smirnov test. This tutorial shows how to run and interpret it in SPSS.

Sign Test for 1 Median

The SPSS sign test for one median, done the right way: recode your outcome variable into values higher and lower than the hypothesized median and test if they're distributed 50/50 with a binomial test.
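Outside SPSS, the same recipe can be sketched in Python. This is an illustration under stated assumptions: values equal to the hypothesized median are dropped, the p-value is the exact two-sided binomial probability for a 50/50 split, and the function name and data are made up.

```python
from math import comb

def sign_test_one_median(values, hypothesized_median):
    """Two-sided sign test: are values above and below the median split 50/50?"""
    n_higher = sum(v > hypothesized_median for v in values)
    n_lower = sum(v < hypothesized_median for v in values)
    n = n_higher + n_lower  # values equal to the median are dropped
    smaller = min(n_higher, n_lower)
    # Probability of a split at least this lopsided in one direction, given p = 0.5.
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2**n
    return min(1.0, 2 * tail)

# Made-up scores that all lie above a hypothesized median of 0.
print(sign_test_one_median([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0))
```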

Sign Test for 2 Related Medians

SPSS sign test for two related medians tests if two variables measured in one group of people have equal population medians.

Significance

Statistical significance is roughly the probability of finding your data, or data deviating even more, if some null hypothesis is true.

If this probability (or “p”) is low, usually p < 0.05, then your data contradict your null hypothesis. In this case, you reject the null hypothesis.

Simple Linear Regression

This tutorial gently walks you through the basics of simple regression: b and beta coefficients, the intercept and r-square (adjusted). Get this right and you'll get it all right.

Simple Random Sampling

Popular statistical procedures such as ANOVA, a chi-square test or a t-test quietly rely on the assumption that your data are a simple random sample from your population. This tutorial walks you through simple random sampling in normal language.

Skewness

Skewness is a number that indicates to what extent a variable is asymmetrically distributed.

We'll quickly walk you through some examples, formulas and software options for obtaining skewness from raw data.

Spearman Rank Correlation

A Spearman rank correlation is a number between -1 and +1 that indicates to what extent 2 variables are monotonically related.

This tutorial quickly walks you through the basics such as assumptions, significance levels, software and more.

Standard Deviation

The standard deviation is a number that indicates how far apart a set of numbers lies. This tutorial explains what a standard deviation is in normal language along with examples.

T-Test (Independent Samples)

An independent samples t-test examines if 2 populations have equal means on some variable.

Example: do Dutch women have the same mean salary as Dutch men?

This tutorial quickly walks you through the basics such as the assumptions, null hypothesis and effect size for this test.

T-Test (One Sample)

A one-sample t-test examines if a population mean is likely to be x: some hypothesized value.

Example: do the pupils from my school have a mean IQ score of 100?

This tutorial quickly walks you through the basics for this test, including assumptions, formulas and effect size.

T-Test (Paired Samples)

A paired samples t-test examines if 2 variables have equal means in some population.

Example: were the mean salaries over 2018 and 2019 equal for all Dutch citizens?

This tutorial quickly walks you through the correct steps for running this test in SPSS.

Variance

The variance is a number that indicates how far apart a set of numbers lies. This tutorial explains the concept gently with examples and illustrations.

Wilcoxon Signed-Ranks Test

SPSS Wilcoxon Signed-Ranks test is used for comparing two metric variables measured on one group of cases. It's the nonparametric alternative for a paired-samples t-test when its assumptions aren't met.

Z-Scores

Z-scores are scores that have mean = 0 and standard deviation = 1.

All scores can be standardized into z-scores by subtracting the mean from each score and then dividing the result by the standard deviation.

Such standardized scores may be easier to interpret than the original scores. Z-scores may or may not be normally distributed.
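The standardization step can be sketched as below. The function name and scores are made up, and the population standard deviation is used here as an assumption; using the sample standard deviation works the same way as long as it is used consistently.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Subtract the mean from each score, then divide by the standard deviation."""
    center, spread = mean(values), pstdev(values)
    return [(value - center) / spread for value in values]

scores = [4, 8, 6, 5, 7]  # hypothetical raw scores
z = z_scores(scores)
print([round(value, 2) for value in z])
print(round(mean(z), 10), round(pstdev(z), 10))  # mean 0 and standard deviation 1
```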

Z-Test for 2 Independent Proportions

A z-test for 2 independent proportions examines if some event occurs equally often in 2 subpopulations.

Example: do equal percentages of male and female students answer some exam question correctly?

This tutorial covers examples, assumptions and formulas and presents a simple Google Sheet for running z-tests the easy way.
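For readers who prefer code over a spreadsheet, the test can be sketched in Python. This assumes the common pooled-proportion version of the test and large enough samples for the normal approximation; the function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_proportions(successes_1, n_1, successes_2, n_2):
    """Two-sided z-test for equal proportions, using the pooled proportion."""
    p_1, p_2 = successes_1 / n_1, successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up exam data: 45 of 100 male and 30 of 100 female students answer correctly.
z, p_value = z_test_two_proportions(45, 100, 30, 100)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```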

Z-Test for Single Proportion

A z-test for a single proportion examines if a population proportion is likely to be x.

Example: does a proportion of 0.60 (or 60%) of some population have antibodies against Covid-19?

This simple tutorial quickly walks you through and includes examples, formulas, assumptions and the continuity and Agresti-Coull corrections.
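The basic, uncorrected version of the test can be sketched as follows. Neither the continuity correction nor the Agresti-Coull correction is applied here, and the function name and sample counts are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_test_single_proportion(successes, n, p0):
    """Two-sided z-test of H0: the population proportion equals p0 (no corrections)."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample: 70 of 100 people have antibodies, H0: proportion = 0.60.
z, p_value = z_test_single_proportion(70, 100, 0.60)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```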
