Everyday utility

Data Exploration — descriptive analytics on demand.

The platform’s “show me what’s in this variable” tool. Pick one or more columns and get back the appropriate descriptive output for each — means and histograms for continuous variables, counts and frequency tables for categorical ones. The right summary is chosen automatically.

Frequency tables · Histograms · Summary stats · Filter-aware · Type-aware (SPSS metadata)
Profile · Age distribution (live result)
What does the age distribution look like?
Continuous variable · histogram with KDE overlay
Mean 38 · Median 36 · Std dev 14

The business question it answers

“Before we model anything, what does the data actually look like?” — and just as often, “Quickly profile this variable for me.”

Three common ways to use it:

  • Sanity check at kickoff. “What’s the response rate by region? How many people answered the brand-awareness question? Is age skewed young?” Catches data-quality issues before any modelling effort is wasted on them.
  • Filter narration. “Of the under-35 respondents, what does the income distribution look like?” The exploration runs against your active chat filter, so you can see exactly who you’re talking about before drawing conclusions.
  • Stakeholder asks. “The CMO wants the top 3 reasons people gave for not subscribing — pull the frequency table.” The chart and table land in chat in seconds.

How the methodology works

Data Exploration doesn’t apply a one-size-fits-all summary. It detects each column’s measurement level — using SPSS variable_measure metadata embedded in the source file when available, or falling back to pandas dtype inference — and selects the right summary accordingly.

  • Continuous / scale variables. Count, mean, standard deviation, the five-number summary (min, 25th, median, 75th, max) and a histogram with KDE overlay.
  • Categorical variables (nominal / ordinal). Count, unique-count, top 10 value frequencies, mode and a bar chart of category frequencies.
  • Filter-aware. When you apply filters in the chat, they are applied to the dataset before the summary is computed — statistics and charts reflect only the filtered population.
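
The type-aware selection described above can be sketched in a few lines of pandas. This is an illustrative sketch under stated assumptions, not the platform's implementation: the function names `measurement_level` and `profile_column` are hypothetical, and the `variable_measure` argument is assumed to be a plain dict mapping column names to SPSS measurement levels ("scale", "ordinal" or "nominal").

```python
import pandas as pd


def measurement_level(series, variable_measure=None):
    """Return 'continuous' or 'categorical' for one column.

    variable_measure is a hypothetical dict of SPSS measurement levels
    keyed by column name; when it has no usable entry we fall back to
    the pandas dtype.
    """
    measure = (variable_measure or {}).get(series.name)
    if measure == "scale":
        return "continuous"
    if measure in ("nominal", "ordinal"):
        return "categorical"
    # Fallback: numeric dtypes (but not booleans) are treated as continuous.
    is_numeric = pd.api.types.is_numeric_dtype(series)
    is_bool = pd.api.types.is_bool_dtype(series)
    return "continuous" if is_numeric and not is_bool else "categorical"


def profile_column(series, variable_measure=None):
    """Pick the summary that matches the column's measurement level."""
    values = series.dropna()
    level = measurement_level(series, variable_measure)
    if level == "continuous":
        # describe() yields count, mean, std, min, 25%, 50%, 75%, max
        return {"type": level, **values.describe().to_dict()}
    counts = values.value_counts()  # sorted by frequency, descending
    return {
        "type": level,
        "count": int(values.size),
        "unique": int(values.nunique()),
        "mode": counts.index[0] if len(counts) else None,
        "top_values": counts.head(10).to_dict(),
    }


df = pd.DataFrame({
    "age": [23, 35, 41, 52, 29],
    "gender": [1, 2, 2, 1, 2],  # coded: 1 = Male, 2 = Female
})
measures = {"age": "scale", "gender": "nominal"}  # assumed metadata

age_profile = profile_column(df["age"], measures)
gender_profile = profile_column(df["gender"], measures)
```

In this sketch, dropping the `measures` dict makes the dtype fallback classify the integer-coded `gender` column as continuous, which shows why embedded measurement levels are the preferred signal when the source file carries them.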

This matters because computing a mean on a coded categorical variable (e.g. 1 = Male, 2 = Female) produces a meaningless number. The type-aware approach avoids this class of error automatically.
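
A small worked example makes the problem concrete. The data and the label mapping below are made up for illustration.

```python
import pandas as pd

# Hypothetical coded responses: 1 = Male, 2 = Female
gender = pd.Series([1, 2, 2, 1, 2], name="gender")

# Arithmetic on arbitrary codes: the result depends only on how the
# categories happened to be numbered.
misleading_mean = gender.mean()

# What a categorical variable actually supports: counts and shares.
labels = {1: "Male", 2: "Female"}
labelled = gender.map(labels)
frequency_table = labelled.value_counts()
shares = labelled.value_counts(normalize=True)
```

Here the mean of the codes comes out at 1.6, a value that corresponds to no category, while the frequency table reports 3 Female and 2 Male respondents.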

What it does not do

  • No modelling. Purely descriptive. It doesn’t fit regressions, run significance tests, or compute correlations.
  • No subgroup fan-out. To compare groups, ask twice with different filters, or use a modelling tool (Driver Analysis, Segmentation) when the question is comparative.
  • No minimum sample size. Because no model is being fit, even small slices are reported. The n-count is always shown so you can judge reliability.
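
Comparing groups by asking twice, each time with a different filter, amounts to something like the following. This is a minimal sketch with made-up data; `profile_with_filter` is a hypothetical helper, and the filter is assumed to arrive as a boolean mask that is applied before any statistic is computed.

```python
import pandas as pd


def profile_with_filter(df, column, mask=None):
    """Apply an optional boolean filter first, then summarise one numeric column."""
    subset = df if mask is None else df[mask]
    values = subset[column].dropna()
    return {
        "n": int(values.size),  # always reported so small slices are visible
        "mean": float(values.mean()) if values.size else None,
        "median": float(values.median()) if values.size else None,
    }


df = pd.DataFrame({
    "age": [22, 31, 34, 45, 58, 63],
    "income": [28000, 41000, 39000, 52000, 61000, 47000],
})

# "Ask twice": the same column, two different filters.
under_35 = profile_with_filter(df, "income", df["age"] < 35)
over_35 = profile_with_filter(df, "income", df["age"] >= 35)
```

Each result carries its own `n`, so a slice of three respondents is reported as exactly that and the reader can judge how much weight the summary deserves.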

Required data

  • Dataset: Any cleaned dataset uploaded to the project
  • Selection: One or more columns to profile
  • Filter: Optional, chat-driven
  • Min sample: None enforced

Use when you hear

  • My team currently does this in Excel and it takes hours.
  • We want a tool people can self-serve from, not a report we get emailed.
  • Researchers want to slice the data themselves before commissioning the heavy analysis.
Included with every project

Comes with the platform.

Data Exploration is built into every Crowdmines project. Run a pricing study, a driver analysis or a segmentation, and you can profile any variable in the same chat — no separate setup, no extra licence.

Step by step

From a question to a chart in six steps.

Here's what you experience in the chat, from start to finish.

  1. Ask the question. “What does the age distribution look like?” or “Profile household income and education level.”
  2. Map the columns. Confirm one or several columns to profile. The platform maps natural-language descriptions to actual column names.
  3. Apply filters. Optional, e.g. “Only women”, “Only respondents from 2024”, “Only people who saw the new packaging”.
  4. Confirm and run. Final summary: which columns will be profiled, active filters, expected sample size.
  5. Review results. For each column: the right summary stats for the variable type, a histogram or bar chart, and a metadata note (variable type, sample size, missing-value count).
  6. Follow up. “Now show me that same variable but only for respondents under 35,” or pivot to a modelling tool in the same chat.
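
The confirmation in step 4 and the missing-value count from step 5 could be assembled roughly as follows. This is a sketch under assumptions, not the platform's code: `confirmation_summary` is a hypothetical helper, and the filters are assumed to be a dict mapping a human-readable label to a boolean mask.

```python
import pandas as pd


def confirmation_summary(df, columns, filters):
    """Build the pre-run summary: columns, active filters, expected sample size."""
    mask = pd.Series(True, index=df.index)
    for condition in filters.values():
        mask &= condition  # every active filter must hold
    return {
        "columns": list(columns),
        "filters": list(filters),  # the human-readable labels
        "expected_n": int(mask.sum()),
        "missing": {c: int(df.loc[mask, c].isna().sum()) for c in columns},
    }


df = pd.DataFrame({
    "gender": ["F", "M", "F", "F", "M"],
    "year": [2024, 2024, 2023, 2024, 2024],
    "income": [41000, 52000, 39000, None, 47000],
})
filters = {
    "Only women": df["gender"] == "F",
    "Only respondents from 2024": df["year"] == 2024,
}

summary = confirmation_summary(df, ["income"], filters)
```

With this made-up data, two respondents pass both filters and one of them has no income value, so the summary shows an expected sample of 2 with 1 missing value before anything is run.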
Compared to

How Crowdmines compares to Excel pivot tables, SPSS Frequencies and survey dashboards.

Pulling frequency tables, histograms and summary statistics is the most common research task and the one most often done with the oldest tools — Excel pivot tables, SPSS Frequencies, R’s summary(), or the built-in dashboards in survey platforms. It’s not glamorous, but it’s where every project starts and where a surprising amount of analyst time gets burned.

Setup effort
  • Traditional (Excel / SPSS): Open the file, navigate menus or write syntax, pick variables one at a time
  • Survey-platform dashboards: Log into the survey platform, navigate to the reporting tab
  • Crowdmines: Type “show me the distribution of X” in the chat

Variable-type awareness
  • Traditional (Excel / SPSS): Analyst must know not to compute a mean on a categorical variable
  • Survey-platform dashboards: Platforms usually get this right for their own data
  • Crowdmines: Automatic, reads SPSS metadata or infers from dtype

Filtering
  • Traditional (Excel / SPSS): Manually filter in the data view or add syntax
  • Survey-platform dashboards: Point-and-click filter builder
  • Crowdmines: Natural language, e.g. “only women under 35”

Multiple variables at once
  • Traditional (Excel / SPSS): Run the command/menu per variable, combine outputs manually
  • Survey-platform dashboards: Some platforms support multi-variable views
  • Crowdmines: “Profile Q5, Q7, and Q12”, all returned together

Transition to modelling
  • Traditional (Excel / SPSS): Close this tool, open a different one, re-import data
  • Survey-platform dashboards: Limited to what the platform offers
  • Crowdmines: Same chat, e.g. “now run a driver analysis on NPS”

Analyst dependency
  • Traditional (Excel / SPSS): Requires someone who knows the tool (SPSS syntax, Excel pivots)
  • Survey-platform dashboards: Researcher can self-serve but only within the platform’s UI
  • Crowdmines: Anyone who can describe the question

Speed
  • Traditional (Excel / SPSS): Minutes per variable (menu navigation, waiting for SPSS to compute)
  • Survey-platform dashboards: Seconds within the platform
  • Crowdmines: Seconds, for any number of variables

Beta Program Open

The workhorse every project starts with — and keeps returning to.

Frequencies, histograms, summary stats. Filter as conversation. The right summary for the right variable, every time — no SPSS syntax, no Excel pivots, no analyst ticket queue.