In qualitative research, data saturation is the point at which you’ve collected enough data to glean the insights you’re looking for.
Think of data saturation as a spectrum. At the start of the spectrum, you’ve got just enough data to begin seeing core themes emerge. You can start drawing conclusions here if you need to, but more data will still make those insights richer.
At the end of the spectrum is peak data saturation. If you keep collecting data after this point, you’ll see the same sorts of results over and over. You won’t gain any new insights from them.
Knowing where your data needs fall on the spectrum can be tricky. It really depends on what type of project you’re doing. If it’s a research project for a business, you can usually work from the low end of the spectrum. If you’re doing scientific research and need to achieve statistical significance, that’s a whole ’nother ball game.
Each qualitative collection method has a general rule for how much data is needed in order to hit data saturation.
Customer Interviews: Collect at Least 5, Reach Saturation at 10
Customer development is an entire field in business. It’s the practice of interviewing customers to discover their pains and needs.
When you understand your customers on a deep level like this, you can deliver the products or services they crave. The more useful and meaningful your offerings are, the better your business will do.
So how many customer interviews should you conduct before you sit down and analyze the results?
Within the field, it’s commonly recognized that you’ll start to see core themes emerge with as few as 5 interviews. This feels really small, but it’s 100% true in my experience.
You can see a micro-example of this for yourself if you head to any review aggregator. On Yelp, I randomly searched for oil change shops in Los Angeles. I clicked on a shop with nearly perfect ratings.
Every single one of the first five reviews mentions how friendly, approachable, and helpful the shop’s owner is. In fact, they all mention him by name. Some even provide personal anecdotes on the qualities that really make the shop owner shine.
The same thing happens with negative feedback. The first five “most relevant” Google reviews for a Walmart in Wasilla, Alaska, reveal that customers are frustrated with the same things: a lack of consistently open self-checkout lanes, low item stocks, and uneven customer service experiences.
And these are just reviews. Interviews work the exact same way.
If you decide to hold more than five interviews, you’ll still get new information. Once you hit 10, though, you’ll stop being surprised by anything the customers say. So save your time and host between 5 and 10 interviews.
The one exception to this is if your initial batch of interviews is highly biased, like if you only talk to your friends and family. To get a strong signal from 5-10 interviews, those people need to be strong representatives of the group you’re trying to get insights from.
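If you want something more concrete than a gut feeling for when you’ve stopped being surprised, you could track how many new themes each interview adds. Here’s a minimal sketch in Python. The theme labels, the sample data, and the `quiet_run` cutoff of three are all made up for illustration; they aren’t part of any standard method.

```python
def new_themes_per_interview(coded_interviews):
    """Count how many previously unseen themes each interview adds."""
    seen = set()
    new_counts = []
    for themes in coded_interviews:
        fresh = set(themes) - seen
        new_counts.append(len(fresh))
        seen |= fresh
    return new_counts


def saturation_point(new_counts, quiet_run=3):
    """Return the number of interviews after which `quiet_run` interviews
    in a row added nothing new, or None if that never happens."""
    run = 0
    for i, count in enumerate(new_counts, start=1):
        run = run + 1 if count == 0 else 0
        if run == quiet_run:
            return i - quiet_run
    return None


# Hypothetical coded interviews: each set holds the themes one customer raised.
interviews = [
    {"price", "onboarding"},
    {"price", "support"},
    {"onboarding", "integrations"},
    {"support"},
    {"price", "reporting"},
    {"integrations"},
    {"price", "support"},
    {"onboarding"},
]

counts = new_themes_per_interview(interviews)
print(counts)                    # [2, 1, 1, 0, 1, 0, 0, 0]
print(saturation_point(counts))  # 5
```

In this made-up data set, nothing new shows up after the fifth interview. That’s the same pattern described above, just written down as a tally.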
Focus Groups: Hold at Least 4, Reach Saturation at 8
Focus groups function a lot like customer interviews. You’ll spot core themes pretty fast. And because you’re dealing with groups of people, you can cut down the volume a bit more. People are often quick to either agree or disagree with discussion points their peers bring up in a focus group.
Here’s a silly example to clarify what I mean. Say you’re a business that sells fresh, refrigerated salsa in grocery stores—basically, the opposite of Pace and other canned stuff. You run a focus group to learn more about whether your salsa tastes like something your participants would eat at a fabulous, hole-in-the-wall Mexican restaurant.
Of the eight participants, seven say they can taste the delightful mingling of flavors: cilantro, lime juice, tomatoes, red onions, jalapeños, and cumin.
One participant spits the salsa out and says it has so much cilantro it tastes like dish soap.
Since the seven other participants love the salsa and can identify the distinct flavors, you can safely assume your outlier of a participant has a genetic predisposition to perceive cilantro as tasting “like soap, mold, dirt or bugs.”
This is how Oxford scholar Charles Spence describes the flavor for those unfortunate souls in “Coriander (cilantro): A most divisive herb,” a piece published in a 2023 issue of the International Journal of Gastronomy and Food Science.
Basically, if your outlier hates cilantro that much, they almost certainly have one of the two “cilantro taste gene” variants that affect anywhere between 3 and 21% of the population, depending on ethnicity. So you’d want to pay more attention to the other seven respondents for your focus group—and maybe screen out cilantro haters next time.
Run at least three more focus groups with different participants, for four in total, and you’ll know whether your salsa is actually good or if that first group was an anomaly. Run four more after that, for eight in total, and you’ll have all the data you need on the matter.
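To get a feel for how often a cilantro hater might turn up in a group of eight, here’s a rough back-of-the-envelope calculation in Python. It assumes the 3 to 21% range describes the population you’re recruiting from and that participants are independent of one another. Real recruiting may not satisfy either assumption, so treat the result as a ballpark.

```python
def chance_of_at_least_one(prevalence, group_size):
    """Probability that a group contains at least one affected person,
    assuming each participant is affected independently."""
    return 1 - (1 - prevalence) ** group_size


# Low and high ends of the range quoted above, for an eight-person group.
for prevalence in (0.03, 0.21):
    chance = chance_of_at_least_one(prevalence, group_size=8)
    print(f"{prevalence:.0%} prevalence: {chance:.0%} chance of at least one cilantro hater")
```

Under those assumptions, the odds run from roughly 22% at the low end to roughly 85% at the high end. So one soap-tasting outlier in a group of eight isn’t all that surprising.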
Observation: Do at Least 5, Reach Saturation at 10
Observations are different from other qualitative data collection methods in that you don’t ask for feedback from anyone. Instead, your job is to blend into the background and observe the behaviors before you.
If you want to gather observational data, you’ll want to replicate the observation at least five times, but not more than 10.
So, let’s say you want to learn whether kids in a third-grade classroom focus better with classical music playing in the background. By the time you’ve observed five different third-grade classrooms, you’ll probably start to notice consistent patterns in the students’ behavior.
You might observe that some children seem to focus more easily during quiet activities like reading with the dulcet tones of Vivaldi playing in the background. But others appear unaffected or even distracted by the music. These initial five sessions will give you a solid baseline to understand how classical music impacts classroom focus.
But to make sure you’ve captured a wide enough range of responses, you should aim for a total of around 10 observations. After the 10th session, any additional classroom observations will mostly repeat the same trends you’ve already identified.
You’ll have enough data to confidently draw conclusions about how classical music influences student focus. You won’t need to spend any more time playing Mozart and Beethoven to a group of curious kids.
While this may (or may not) sound like fun, it’ll just pile extra work onto your lap without giving you fresh insights in return.
Surveys: Collect at Least 30, Reach Saturation at 100
Surveys are a lot like interviews, but because they’re static, there’s no probing deeper if you have questions. Because of this, you need at least 30 survey responses to hit that baseline level on the saturation spectrum.
You’ll definitely see key patterns emerge after 30 responses. By 100, though, you’ll stop seeing anything new. You might be tempted to keep going—especially if you’ve collected really insightful and valuable data so far—but that just forces you to do more work.
Imagine you’re surveying patients about their healthcare experiences. By the 30th response, you’ll start seeing similar stories about bedside manner, wait times, or treatment satisfaction.
By the time you reach 100 responses, you’ll usually have captured the full range of experiences. Additional data won’t tell you anything you haven’t already learned.
That said, there is an exception here. In cases where you’re using advanced survey techniques like conjoint analysis or discrete choice modeling, you might need way more than 100 responses.
These advanced survey methods are designed to identify the teeniest differences in preference. And that process takes a huge volume of responses.
Say you’re surveying customers about the features they’d want in a new smartphone. You ask them to compare various configurations of features—things like screen size, battery life, and camera quality.
You’ll need a huge sample—probably 500 or even 1,000 respondents—to reliably determine which combinations are most desirable.
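Setting aside advanced techniques like conjoint analysis, the rules of thumb for the four methods fit in a small lookup table. Here’s a minimal sketch in Python that tells you roughly where you sit on the saturation spectrum. The numbers come straight from the section headings; the names `RULES_OF_THUMB` and `saturation_status` and the three status labels are my own shorthand, not standard terminology.

```python
# (minimum to see core themes, point of saturation) for each method,
# taken from the rules of thumb in this article.
RULES_OF_THUMB = {
    "customer_interviews": (5, 10),
    "focus_groups": (4, 8),
    "observations": (5, 10),
    "surveys": (30, 100),
}


def saturation_status(method, collected):
    """Place a data set on the saturation spectrum for the given method."""
    minimum, saturation = RULES_OF_THUMB[method]
    if collected < minimum:
        return "keep collecting"
    if collected < saturation:
        return "core themes emerging"
    return "likely saturated"


print(saturation_status("customer_interviews", 3))  # keep collecting
print(saturation_status("focus_groups", 5))         # core themes emerging
print(saturation_status("surveys", 120))            # likely saturated
```

These are starting points, not guarantees. A biased sample or a method like discrete choice modeling can push the real numbers well past what the table says.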
Why Data Saturation is SO Important in Qualitative Research
Qualitative research takes a TON of time. It’s not easy. From picking data collection methods and organizing interviews (or surveys, observations, or focus groups) to data coding and analysis, it all takes forever to do. Even if you have tools to help you go faster.
Any additional time spent gathering data beyond the saturation point is basically wasted. There’s a real opportunity cost involved, too: you’re investing precious time in a research project that won’t yield any new insights.
Instead of focusing on running just one more survey or interview or focus group, move on. Shift gears to analysis. Start making meaningful recommendations based on what you found.
This is what will ultimately drive change.
Infinitely hosting focus groups for salsa lovers won’t drive sales or land you in the most popular grocery stores. Using the data gleaned from your four to eight focus groups to persuade higher-ups will. Interviewing customers ad nauseam won’t move your product forward in development. Using the data gathered in five to 10 interviews to refine product features will.
Et cetera.
So don’t let yourself fall prey to tunnel vision. Trust me, I’ve been there before. I know how satisfying data collection can be.
There’s nothing like gathering information from a bunch of different people and knowing it’s original, primary research that’s going to benefit your organization. It’s thrilling, it’s satisfying, it probably releases all sorts of happy chemicals in your brain.
But once you’ve got what you need, it’s time to move forward.
Because the real value that research provides is the insights and recommendations that can lead to change. Endless data collection delays that.
Keep an eye on data saturation and you’ll get just the right amount of qualitative data.