A frequency distribution is a table listing each distinct value of some variable and the number of times it occurs in some dataset. In short, a frequency distribution is a table showing how frequencies are distributed over values. Let's take a look at some examples.
Frequency Distribution - Example
We recently taught a course to 183 students and we asked them to evaluate it by filling out a brief questionnaire. Part of the data we collected is shown below.
One of the first ways to gain some insight into our data is inspecting which values are present in our variables and how often these occur. That is, we'll inspect frequency distributions, one of which is shown below.
Unsurprisingly, our variable “sex” contains two values: female and male. Note that a “value” may be a number or, as in this case, a word. Roughly half of our 183 respondents are female.
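To make the idea concrete, here is a minimal sketch of how such a table could be computed in Python. The function name `frequency_table` and the eight toy values are our own assumptions for illustration; they are not the course evaluation data.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, frequency, percentage) rows in order of first appearance."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(value, n, 100 * n / total) for value, n in counts.items()]


# Hypothetical toy data, NOT the 183 course respondents.
sex = ["female", "male", "female", "female", "male", "male", "female", "male"]

for value, n, pct in frequency_table(sex):
    print(f"{value:<8}{n:>4}{pct:>8.1f}%")
```

Each row holds one distinct value, the number of times it occurs, and that count as a percentage of all observations.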
Frequency Distribution - Cumulative Percentages
Our previous table showed the frequencies for female and male respondents, their percentages and their cumulative percentages. A cumulative percentage is the percentage for some value plus the percentages for all values that precede it. Importantly, which values “precede” some value depends on the order in which we choose to list our values. For example, the table below is a frequency distribution for (study) major, sorted alphabetically.
Note that the cumulative percentage for economy is 39.3%. However, take a look at what happens if we sort our table by descending frequency.
The cumulative percentage for economy is now 73.2% even though these are the exact same data as previously. By sorting the table differently, both psychology and anthropology now precede economy, affecting the cumulative percentages.
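The effect of sort order on cumulative percentages can be sketched in a few lines of Python. The counts below are hypothetical round numbers chosen to keep the arithmetic easy to follow; they are not the frequencies from our course data, and `cumulative_table` is a helper name we made up for this illustration.

```python
from itertools import accumulate


def cumulative_table(counts, order):
    """Return (value, percentage, cumulative percentage) rows in the given order."""
    total = sum(counts.values())
    percents = [100 * counts[value] / total for value in order]
    return list(zip(order, percents, accumulate(percents)))


# Hypothetical counts for 100 students, NOT the course data.
major = {"anthropology": 30, "economy": 20, "psychology": 40, "sociology": 10}

alphabetical = sorted(major)
by_frequency = sorted(major, key=major.get, reverse=True)

for label, order in [("alphabetical", alphabetical), ("by frequency", by_frequency)]:
    print(label)
    for value, pct, cum in cumulative_table(major, order):
        print(f"  {value:<14}{pct:>6.1f}%{cum:>8.1f}%")
```

With these toy counts, economy has a cumulative percentage of 50% in the alphabetical table but 90% when sorting by descending frequency, even though its own percentage (20%) is the same in both.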
Frequency Distribution - Bar Charts
A common way to visualize a frequency distribution is a bar chart. As shown in the figure below, it gives insight into your data much faster than a table.
Frequency Distribution - Pie Charts
An alternative visualization for a frequency distribution is a pie chart. The figure below shows an example.
Frequency Distribution - Metric Variables
Frequency distributions are mostly used for categorical variables. In some cases, running frequency distributions also works well for metric variables. For instance, “How many children do you have?” will probably yield values between zero and perhaps 4 or 5. With 5 or 6 distinct values, our frequency distribution will have 5 or 6 rows and will nicely summarize our variable.
However, most metric variables will result in a huge frequency table that doesn't give any insight into our data whatsoever. This is because metric variables (such as monthly income in dollars) typically have many distinct values and each of those will show up in our table. Histograms and descriptive statistics are more suitable choices for such variables.
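The sketch below illustrates the problem and the idea behind a histogram: instead of counting each distinct value, we count how many values fall into each interval. The incomes, the bin width of 1000 and the helper name `binned_frequencies` are all assumptions made up for this example.

```python
from collections import Counter


def binned_frequencies(values, width):
    """Count values per interval [lower, lower + width), keyed by lower bound."""
    bins = Counter((value // width) * width for value in values)
    return sorted(bins.items())


# Hypothetical monthly incomes in dollars, invented for illustration.
incomes = [1850, 2120, 2475, 2990, 3010, 3340, 3675, 4120, 4480, 5230]

# A plain frequency table would need one row per distinct value.
print("distinct values:", len(set(incomes)))

# Grouping into intervals of 1000 dollars gives a much shorter summary.
for lower, n in binned_frequencies(incomes, 1000):
    print(f"{lower}-{lower + 999}: {n}")
```

Here every income is unique, so a plain frequency table would have ten rows each with frequency 1, whereas the binned version needs only five rows.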