Open-ended questions are great for getting authentic feedback because they give people a chance to describe what they’re experiencing in their own voice. Analyzing such survey questions yourself is an excellent opportunity to empathize with your audience, gather essential insights, and make the right decisions.
But how do you efficiently analyze more than 100 replies? Or even 1,000?
Here’s a system that Hotjar uses to categorize and visually represent large volumes of qualitative data—it’s easier than you might think! You’ll have to work with the technique a bit before you become comfortable with it, but once you get it, you’ll be sorting through mountains of qualitative data in no time.
What you’ll need:
- Working knowledge of spreadsheets (Google Sheets or Excel)
- A quiet space with some uninterrupted focus time
- Hotjar’s Open-ended question analysis template
The open-ended question analysis template
Step 1: get your data into the template
1) Export the data from your survey or poll into a .CSV or .XLS file.
2) Copy the data from your .CSV or .XLS file and paste it into the sheet ‘CSV Export’ of the template.
🏆 Pro tip: use ‘Paste special’ to paste ‘Values Only’ in the Hotjar analysis template, so no formulas or formatting are copied over.
This is what your data should look like after being copy-pasted in the ‘CSV Export’ sheet
3) Copy the column from the ‘CSV Export’ sheet containing the open-ended question you want to analyze first and paste it into the ‘Question 1’ sheet, in the cell marked with < Paste answers to first open-ended question here >.
4) Choose wrap text for the entire column, so the data fits the column width and is easier for you to read later on.
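If you would rather script this step than copy the column by hand, it can be sketched in Python. This is a minimal sketch and not part of the Hotjar template: the function name `extract_column`, the sample export, and the `answer` column header are all assumptions made for illustration.

```python
import csv
import io


def extract_column(csv_text, column_name):
    """Return the non-empty answers found in one column of a CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    answers = []
    for row in reader:
        # Missing cells come back as None, so fall back to an empty string.
        value = (row.get(column_name) or "").strip()
        if value:
            answers.append(value)
    return answers


# Hypothetical export with one open-ended question in the "answer" column.
sample_export = "id,answer\n1,Sales\n2,\n3, Traffic \n"
print(extract_column(sample_export, "answer"))
```

Blank answers are skipped here so they don't show up as empty rows later. Whether you want that depends on how you plan to calculate percentages.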
Step 2: identify response categories
A response category is a set of replies that can be grouped because they are part of the same theme, even if they’re worded differently.
In the sample dataset for this tutorial, Hotjar asked their customers to explain how their employer measures their performance (e.g., revenue, conversions, traffic). In theory, you could go through every answer to identify your response categories one-by-one, but that wouldn’t be very efficient. Instead, use a series of techniques to help you identify the broad categories:
A) Use a text analyzer: text analyzers take your data and analyze it for the most commonly used words in your text, which helps you identify broad categories of responses.
🏆 Pro tip: Textalyser is a simple, free resource that does this well.
Copy and paste your data into Textalyser and click ‘Analyze the text’
If you do this with the sample provided above, you’ll find that ‘sales,’ ‘conversion,’ and ‘traffic’ are some of the most commonly used words in the data set:
As such, they represent some of the most popular replies to the question asked. They don’t represent all the answers, of course, but they’re a good place to start when building the list of response categories.
Add each category to the top of a separate column (replacing the text that reads, ‘Response Category 01,’ ‘Response Category 02,’ etc.):
Note: some of the popular words in the text analyzer mean the same thing (e.g., ‘sales’ and ‘revenue’), so you’ll want to create a single category for those responses called ‘Sales/Revenue.’ Other popular words will NOT become categories because, as stand-alone words, they tell nothing useful (e.g., ‘our,’ ‘rate’).
B) Sort your responses alphabetically: when you sort alphabetically, you’ll notice that specific patterns emerge, and you can create more categories based on the trends you spot.
Scan the alphabetically sorted responses for other categories, such as ‘It’s not measured,’ ‘Traffic,’ ‘Conversions,’ etc. Be on the lookout for synonyms, but don’t worry if you create a few redundant categories for now. You will combine the categories that mean the same thing at the end.
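The word-frequency pass from technique A) can be approximated in a few lines of Python if you prefer not to paste data into an online tool. This is only a sketch of the general idea, not how Textalyser works internally. The function name `top_words`, the stop-word list, and the sample responses are assumptions for illustration.

```python
import re
from collections import Counter

# Assumed list of filler words that tell you nothing on their own.
STOP_WORDS = {"a", "and", "by", "i", "is", "my", "of", "our", "the", "then", "to", "we"}


def top_words(responses, n=10):
    """Return the n most frequent words across all responses."""
    words = []
    for response in responses:
        # Lowercase first so 'Traffic' and 'traffic' count as the same word.
        words.extend(re.findall(r"[a-z']+", response.lower()))
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(n)


# Hypothetical responses, loosely modeled on the tutorial's sample question.
responses = [
    "Revenue, then conversion rate, then traffic.",
    "Sales and traffic",
    "Traffic",
]
print(top_words(responses, 3))
```

As with the text analyzer, the output is a starting point: you still decide which frequent words become categories and which synonyms to combine.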
Step 3: record the individual responses
1) Place a ‘1’ in each cell where a response (the row) matches a category (the column) to identify a positive response in each category. Add categories as you go.
For example, if you sort the sample data alphabetically, you’ll find that the response in Row 36 reads, ‘Huh?’ If you added ‘Did not understand the question’ to Column E (see screenshot), you’ll place a ‘1’ in E36.
Note: in the example, many respondents indicate that their performance is measured by multiple factors (e.g., lead gen + sales + customer satisfaction). Be sure to place a ‘1’ in each category that applies. For example, the row for the single answer ‘Revenue, then conversion rate, then traffic.’ will record three different positive responses.
When you input your first ‘1,’ the cell in Row 3 (below the category) will change to indicate the number of positive responses in that category. Row 4 will change from a ‘#DIV/0!’ error to the percentage of responses that fall into each category.
2) Use the ‘Find’ feature to search for words related to each category: begin with the first category (‘sales’) and search the data column for any response that mentions ‘sales.’ Read the entire response to ensure it fits the category you searched for, then place a ‘1’ in the appropriate column for that response.
3) Fill in the gaps: read each row that hasn’t been categorized and place a ‘1’ under the appropriate category, creating new categories as necessary. As you create new categories, search your data for those terms to quickly find similar responses.
⚠️ Important: when adding a new category as you go through the responses, make sure to retroactively check previous answers that might fit in this new category.
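The search-then-tally routine in this step can be sketched in Python as follows. This is an illustration under stated assumptions, not the template's actual formulas: `find_candidates` and `summarize` are hypothetical helper names, the sample responses are made up, and the percentage is assumed to be the count divided by the total number of responses.

```python
def find_candidates(responses, keywords):
    """Return the indexes of responses that mention any keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        index
        for index, response in enumerate(responses)
        if any(keyword in response.lower() for keyword in lowered)
    ]


def summarize(tagged_rows, categories):
    """Return {category: (count, percentage)} for a list of per-response tag sets."""
    total = len(tagged_rows)
    summary = {}
    for category in categories:
        count = sum(1 for tags in tagged_rows if category in tags)
        # Guard against an empty sheet (the spreadsheet's division-by-zero case).
        percentage = 100 * count / total if total else 0.0
        summary[category] = (count, percentage)
    return summary


# Hypothetical responses; the last one mentions 'sales' but does not fit the category.
responses = [
    "Sales targets",
    "Huh?",
    "Revenue, then conversion rate, then traffic.",
    "We have no sales team, so it's not measured",
]
print(find_candidates(responses, ["sales", "revenue"]))

# Tags assigned after reading each candidate; one response can carry several.
tagged_rows = [
    {"Sales/Revenue"},
    {"Did not understand the question"},
    {"Sales/Revenue", "Conversions", "Traffic"},
    {"It's not measured"},
]
print(summarize(tagged_rows, ["Sales/Revenue", "Conversions", "Traffic"]))
```

The search only suggests candidates. The last sample response matches ‘sales’ but belongs under ‘It’s not measured,’ which is why reading the entire response still matters. When you add a new category, rerunning the search over all responses covers the retroactive check.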
Step 4: organize your categories
1) Group your data: you will almost certainly find responses that belong together but ended up in different categories because respondents used different words to describe the same concept. In the sample data, the terms ‘Lead Gen’ and ‘Form Submissions’ belong in the same category.
Drag these columns next to each other, and apply a color to the group of columns you plan to merge—this marks them as a group so you can return to them in a bit when it’s time to combine them. Repeat this step for each set of categories you plan to join.
2) Arrange your categories from large to small: arrange your categories in descending order from left to right. For those that account for only a small percentage of the total (2% or less), use the grouping method above to merge them into one category called ‘Others,’ which you’ll leave on the far right.
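The merging and ordering described in this step can be sketched in Python. This is a hedged illustration rather than anything the template does for you: the helper names `percentages` and `organize`, the merged category name, and the sample counts are assumptions. Renaming tags per response (instead of adding up column totals) means a response that mentions two merged categories is still counted once.

```python
from collections import Counter


def percentages(tagged_rows):
    """Return {category: percentage of responses tagged with it}."""
    total = len(tagged_rows)
    counts = Counter(tag for tags in tagged_rows for tag in tags)
    return {tag: 100 * count / total for tag, count in counts.items()}


def organize(tagged_rows, merges, threshold=2.0):
    """Merge synonym categories, fold small ones into 'Others', sort large to small."""
    # Rename each tag to its merged name; sets drop any resulting duplicates.
    merged = [{merges.get(tag, tag) for tag in tags} for tags in tagged_rows]
    small = {tag for tag, pct in percentages(merged).items() if pct <= threshold}
    grouped = [{"Others" if tag in small else tag for tag in tags} for tags in merged]
    result = percentages(grouped)
    ordered = sorted((tag for tag in result if tag != "Others"), key=lambda tag: -result[tag])
    if "Others" in result:
        ordered.append("Others")  # 'Others' always goes last (far right)
    return [(tag, result[tag]) for tag in ordered]


# Hypothetical tallies for 100 responses.
tagged_rows = (
    [{"Sales"}] * 50
    + [{"Lead Gen"}] * 20
    + [{"Form Submissions"}] * 10
    + [{"NPS"}] * 2
    + [{"Bonus"}] * 1
    + [set()] * 17
)
merges = {
    "Lead Gen": "Lead Gen/Form Submissions",
    "Form Submissions": "Lead Gen/Form Submissions",
}
print(organize(tagged_rows, merges))
```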
Step 5: represent your data visually
1) Prep your data to create a bar chart. First, select and copy the top three rows of your spreadsheet (those that make up the ‘Response Categories,’ ‘Total respondents who answered X,’ and ‘% respondents who answered X’).
Paste them into the ‘Graph Question 1’ sheet using the ‘Paste special’ feature to paste only the values (so the formulas don’t copy over).
Select and copy the table you just pasted, and choose ‘Paste special’ again—this time using ‘Paste transposed’ to invert the rows and columns (this makes your data more chart-friendly).
This is what you should see:
Your table containing categories, the volume of responses, and percentage should look like the above
2) Create your chart: insert your chart, selecting the percentage column as your ‘Series’ and the categories as your ‘X-axis.’ Resize the chart however you see fit.
Your open-ended answers are now visualized in a graph
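For readers working outside a spreadsheet, the transpose-then-chart steps can be sketched in plain Python. This is a rough stand-in for the spreadsheet's ‘Paste transposed’ and chart features, not a replacement for them. The helper names `transpose` and `text_bar_chart` and the sample figures are assumptions for illustration.

```python
def transpose(table):
    """Swap rows and columns, like 'Paste transposed' in a spreadsheet."""
    return [list(column) for column in zip(*table)]


def text_bar_chart(rows, width=40):
    """Render (label, percentage) pairs as a simple text bar chart."""
    if not rows:
        return ""
    label_width = max(len(label) for label, _ in rows)
    lines = []
    for label, pct in rows:
        bar = "#" * round(pct / 100 * width)
        lines.append(f"{label.ljust(label_width)} | {bar} {pct:.0f}%")
    return "\n".join(lines)


# Hypothetical copy of the top three rows of the analysis sheet.
table = [
    ["Response Categories", "Sales/Revenue", "Conversions", "Traffic"],
    ["Total respondents who answered X", 2, 1, 2],
    ["% respondents who answered X", 50.0, 25.0, 50.0],
]
transposed = transpose(table)
header, data = transposed[0], transposed[1:]
# Categories become the axis labels and percentages become the series.
print(text_bar_chart([(label, pct) for label, _count, pct in data]))
```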
And there you have it—a visual representation of your data! Feel free to experiment with different formats if you’re putting the chart into a formal presentation.
Analyzing open-ended questions efficiently and empathizing with your audience takes some practice, but it gets easier each time. Your mind will begin to recognize patterns as you go, so don’t be afraid to dive in.
PS. This process is time-consuming, and not just the analysis: everything that precedes it takes time too (coming up with questions, creating a survey, segmenting the email list, sending out the survey). However, it has always been one of the most valuable things I have done for each project. The learnings, ideas, and deeper understanding of customers are invaluable.