Representing Data
Think for a moment about the number five. If you need to visually communicate the number five to someone, how would you do it? You might write the numeral “5,” which is a symbol you likely learned as a child, or you might use the Roman numeral “V” or the symbol for five established in other writing systems. You might use tally marks, or other shapes. Now, how would you represent five million? We can write out the number several different ways, but it’s much harder to intuitively comprehend than five tally marks. When we start considering large numbers, or many numbers at once, then we need different symbols to understand those amounts.
Before we dive into making charts and spreadsheets and telling stories with data, we need to start with some basic definitions of what actually makes something “data visualization.” Follow along or access the lecture slides here.
Review: Data Types
Hopefully you read about different data types in Chapter 1 of Data + Design. This can be a little abstract, so let’s review with examples possibly related to your life.
Qualitative vs. Quantitative
Data + Design points out that these terms aren’t used consistently across all sources, but it’s still a useful distinction to make. If you’re collecting data about the food you eat, you could keep qualitative data such as “bagel sandwich and a strawberry-banana smoothie.” Or you could keep quantitative data such as “540 calories.” (It can be beneficial to track both, since each type gives us very different information.)
The gray area is when you devise systems for making qualitative data numerical. For example, hospitals often use a pain scale to have patients rank their perception of pain, and this is useful to track how a person’s pain changes over time. But since it’s based on perception, there’s no way to compare whether two patients who report a pain level of “7” are actually in the same amount of pain.
Discrete vs. Continuous
One way to remember discrete vs. continuous data is to think about making cookies. If a recipe calls for 2 cups of sugar and you only have 1 cup, you can do the math and make half the recipe. But if the recipe calls for 1 egg, that’s tricky, because it’s hard to measure half an egg. Sugar can be measured in any amount (continuous), but eggs are counted in whole numbers (discrete).
In sports, any timed competition like running or swimming uses continuous data — you can beat a competitor by the tiniest fraction of a second. Any scored competition like football or basketball uses discrete data — you can’t score half a touchdown.
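If it helps to see the cookie example worked out, here is a minimal Python sketch. The recipe dictionary and the `scale` helper are made up for illustration; they only cover the two ingredients mentioned above.

```python
from fractions import Fraction

# Hypothetical recipe from the cookie example:
# sugar is continuous (any amount), eggs are discrete (whole numbers only).
recipe = {"sugar_cups": Fraction(2), "eggs": 1}

def scale(recipe, factor):
    """Scale every ingredient and report whether the egg count is still whole."""
    scaled = {name: amount * factor for name, amount in recipe.items()}
    eggs_ok = scaled["eggs"] == int(scaled["eggs"])
    return scaled, eggs_ok

half, eggs_ok = scale(recipe, Fraction(1, 2))
print(half["sugar_cups"])  # 1
print(half["eggs"])        # 1/2
print(eggs_ok)             # False: half an egg is not a valid discrete value
```

Halving the sugar gives a perfectly usable amount, while halving the egg gives a value that falls between the whole numbers a discrete count allows.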
Nominal
This is often referred to as categorical data, because it’s about categories of things that have no built-in order. If you want to track your monthly budget, it would likely help to designate categories such as food, housing, transportation, etc.
Ordinal
This is data organized on a scale, so you can think about it as categories with an order. Rating restaurants or products from 1-5 stars is an example of ordinal data that we’re familiar with, and so are letter grades.
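One way to picture "categories with an order" is to sort letter grades. The sketch below assumes a five-step A-F scale and uses made-up sample grades. It shows that ordinal data can be sorted and has a middle value, even though the gaps between grades are not measured.

```python
# Letter grades listed from lowest to highest; the order is what makes them ordinal.
GRADE_ORDER = ["F", "D", "C", "B", "A"]

def rank(grade):
    """Position of a grade on the scale, used only to compare order."""
    return GRADE_ORDER.index(grade)

grades = ["B", "A", "C", "B", "F"]  # made-up sample data

sorted_grades = sorted(grades, key=rank)
print(sorted_grades)  # ['F', 'C', 'B', 'B', 'A']

# The middle value is meaningful for ordered categories.
middle = sorted_grades[len(sorted_grades) // 2]
print(middle)  # B
```

The rank numbers are only positions. Nothing here claims that the step from C to B is the same size as the step from B to A.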
Interval
This is similar to ordinal data, but with equal and measurable differences between values. Temperature and time are good examples: 10 degrees Fahrenheit is 20 degrees colder than 30 degrees Fahrenheit, and 30 degrees is that same amount colder than 50 degrees. The amount of time between 1:15 and 1:45 is the same as the amount of time between 10:45 and 11:15.
Ratio
The main difference to remember here is that ratio data has a meaningful zero point, unlike interval data. A gravestone that says a person lived from 1910-2005 is using interval data; an obituary that says a person was 95 years old is using ratio data. Similarly, you could record your nightly sleep as interval data (sleeping from 10:45 to 7:10) or ratio data (slept 8 hours and 25 minutes).
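The sleep example can be worked through in a few lines of Python. This is a sketch under two assumptions: the clock times are taken to mean 10:45 p.m. and 7:10 the next morning, and the calendar dates are arbitrary placeholders, since the example only gives times.

```python
from datetime import datetime

# Interval data: two points on the clock (dates are arbitrary placeholders).
bedtime = datetime(2024, 1, 1, 22, 45)   # assumed 10:45 p.m.
wake_time = datetime(2024, 1, 2, 7, 10)  # assumed 7:10 a.m. the next day

# Subtracting two interval values gives a duration, which is ratio data.
slept = wake_time - bedtime
print(slept)      # 8:25:00

# Durations have a true zero (no sleep at all), so "half as much" makes sense.
print(slept / 2)  # 4:12:30
```

Halving the duration gives a sensible answer, while "half of 10:45 p.m." does not mean anything. That difference is the practical effect of having a meaningful zero point.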
Readings Associated with this Lesson
• Data + Design Chapter 2
• Designing Data Visualizations Chapter 1 (first half only; you can stop at “Exploration versus Explanation”)
Discussion
Let’s think about the various ways data already plays a role in your life. (This is related to the assignment, so it may be helpful to complete that first.)
Personal Data: What is one example of numerical data about your life that matters to you?
- When or why did you start caring about this?
- Is it something you make a point to track, or is it tracked/recorded automatically in some way?
Assignment
Scavenger Hunt: In this assignment, you’ll be finding examples of data and visualizations in your own life and daily activities.