# SPSS Tutorials


# Assumption of Infinity

## Summary

For most statistical tests, it is quietly assumed that your sample size is less than roughly 10% of your population size. If this doesn't hold, some test results may be severely biased unless you use a special correction.

## Introduction

Many of the statistical tests we commonly use are based on rigorous mathematical assumptions. One of those is that population sizes are infinite. In the real world, of course, many populations consist of a limited number of objects. A lot of research thus violates the infinity assumption.
Does this bias research conclusions? The answer is: not really for samples that are much smaller than the populations they represent. However, for samples larger than roughly 10% of the population size, some ‘finity bias’ occurs. Such finity bias means that the standard errors of parameter estimates are overestimated; the estimates themselves are not affected.
The remainder of this tutorial discusses the three scenarios that apply to most real-world research.

## 1. Sample Holds Small Proportion of Population

In this scenario, implicitly assumed by most statistical tests, finity bias is negligible and therefore ignored. So how small is small? There is no sharp cutoff: bias decreases gradually as the sampling proportion decreases. Opinions differ, but it's sometimes suggested that a sample containing less than 10% of a population is small enough to avoid ‘finity bias’.
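
To get a feel for how gradually the bias shrinks, the sketch below tabulates the correction factor $$\sqrt{1 - n/N}$$ (the factor from the formula discussed in the next section) for a few sampling proportions. The function name `fpc` and the chosen proportions are illustrative assumptions, not part of SPSS.

```python
import math


def fpc(sampling_fraction):
    """Correction factor sqrt((N - n) / N), written as sqrt(1 - n/N)."""
    return math.sqrt(1.0 - sampling_fraction)


# Illustrative sampling proportions n/N (hypothetical values).
for fraction in (0.01, 0.05, 0.10, 0.25, 0.50, 0.90):
    factor = fpc(fraction)
    # How much larger the uncorrected standard error is than the corrected one.
    overestimate_pct = (1.0 / factor - 1.0) * 100
    print(f"n/N = {fraction:4.2f}  factor = {factor:.3f}  "
          f"uncorrected se too large by {overestimate_pct:.1f}%")
```

At a sampling proportion of 10%, the factor is about 0.949, so the uncorrected standard error is roughly 5% too large. That is small but not zero, which is why the 10% rule is a convention rather than a hard boundary.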

## 2. Sample Holds Large Proportion of Population

Statistical test results may be severely biased if your sample size is larger than roughly 10% of your population size. If so, you'll need to apply a correction factor to your standard errors. In practice, you'll (hopefully) have your software package do the math for you. In SPSS, this finity correction is implemented in the Complex Samples option. A formula that's often used for this is
$$se_{c} = se \sqrt{\frac{N - n}{N}}$$
where

• $$se_{c}$$ is the corrected standard error;
• $$se$$ is the uncorrected standard error;
• $$N$$ is the population size;
• $$n$$ is the sample size.
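
As a minimal sketch of this formula outside SPSS, the function below applies the correction to a given standard error. The name `corrected_se` and the example numbers are hypothetical and only serve to illustrate the arithmetic.

```python
import math


def corrected_se(se, population_size, sample_size):
    """Return se * sqrt((N - n) / N), the corrected standard error."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be in the range 1..population_size")
    return se * math.sqrt((population_size - sample_size) / population_size)


# Hypothetical example: se = 2.0, N = 1000, n = 400 (sampling proportion 40%).
print(round(corrected_se(2.0, 1000, 400), 3))  # about 1.549
```

With 40% of the population sampled, the corrected standard error is about 77% of the uncorrected one, so confidence intervals based on the uncorrected value would be noticeably too wide.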

What the formula basically says is that corrected standard errors become smaller as the sampling proportion becomes larger. This makes perfect sense: as the sample size gets closer to the population size, sample outcomes have less room to deviate from their population counterparts.
Note that the corrected standard error reduces to zero if the sample is the population. This is the final scenario we'll discuss.
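
The formula can also be checked by simulation. The sketch below builds a hypothetical finite population, repeatedly draws samples without replacement, and compares the observed spread of the sample means with the uncorrected and corrected standard errors. The population values, sizes and seed are arbitrary assumptions chosen for illustration.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so repeated runs give the same numbers

# Hypothetical setup: population size, sample size, number of repeated samples.
N, n, draws = 1000, 500, 2000

# An arbitrary finite population of N values.
population = [random.gauss(50, 10) for _ in range(N)]

# random.sample draws WITHOUT replacement, as in real finite-population sampling.
sample_means = [statistics.fmean(random.sample(population, n))
                for _ in range(draws)]
empirical_se = statistics.stdev(sample_means)

# Uncorrected standard error of the mean, and the corrected version.
se = statistics.stdev(population) / math.sqrt(n)
se_c = se * math.sqrt((N - n) / N)

print(f"empirical: {empirical_se:.3f}")
print(f"uncorrected se: {se:.3f}")
print(f"corrected se_c: {se_c:.3f}")
```

With half of the population sampled, the observed spread of the sample means should land close to the corrected value and clearly below the uncorrected one.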

## 3. Sample is Population

In some cases, the data we're analyzing contain the entire population we'd like to know something about. In this case, significance tests don't make any sense whatsoever and neither do confidence intervals.
To see why this is so, remember that such statistical procedures aim at generalizing sample outcomes to their population counterparts. If the sample is the population, however, sample outcomes can't be any different from their population counterparts. Note that this reasoning does not take the possibility of measurement error into account.
