Customer Typology

Segmentation — who are our customers really, and how should we organise around them?

Segmentation turns a survey into a small number of distinct, actionable respondent groups — and delivers a profiled, narrated report explaining each one. It is not a single clustering algorithm in a wrapper: an orchestrator profiles the data, fans out across seven methods and a full cluster-count sweep, scores every candidate solution, and hands the one you choose to a deep-research report microservice. The markdown report and the significance-tested Excel workbook are the two artefacts that traditionally take a research agency four to six weeks.

Seven-method orchestrator · Full k-sweep, 2–10 · Significance-tested cross-tabs · Deep-research report · Marketing-Truth pre-flight
Segment solution · Beverage typology (live result)
Six segments, six distinct signatures
Attitudinal profile: each cell indexed vs the sample average (100)
Method: KAMILA · Segments: 6 · Silhouette: 0.31 · Seg variables: 12

The business question it answers

“Who are our customers really, and how should we organise our marketing, product and CX work around them?”

Say a beverage brand fields a 60-question study with 1,800 respondents. They feed in twelve attitudinal items — need-states, occasion preferences, brand-attitude statements — as segmentation variables, and twenty more — demographics, usage frequency, brand awareness, NPS — as profiling variables. The orchestrator screens the twelve for items where everyone already agrees, runs several clustering methods in parallel across k = 2..10, prunes the dominant-cluster solutions, and returns the top three. The user picks one — six segments, balanced sizes, silhouette 0.31, a KAMILA solution — and the chat hands back a multi-section markdown report with named personas, a significance-tested Excel workbook, and an executive summary slotted ahead of section one.

How the methodology works

Segmentation is the most multi-stage tool on the platform. The orchestrator types the variables, decides which clustering methods suit the data, runs them across a sweep of cluster counts in parallel, scores and prunes the resulting solutions, builds significance-tested cross-tab workbooks, and hands the chosen solution to a deep-research report microservice.

  • Seven methods, auto-selected. Six clustering algorithms — K-Means, Hierarchical, LCA, K-Prototypes, VarSelLCM and KAMILA — plus URF importance filtering. The orchestrator scores each method against the data profile and pairs it with a dimension-reduction option: none, PCA or URF.
  • Marketing-Truth pre-flight. Any segmentation variable where ≥ 60% of respondents already agree (or already disagree) is flagged before clustering, because a typology built on a variable everyone agrees with is dead on arrival.
  • A full k-sweep, not a guess. Every k from 2 to 10 runs against every chosen method in parallel — a typical run produces 30–60 candidate solutions, instead of eyeballing a single silhouette plot.
  • Composite scoring and pruning. Each solution is scored on silhouette, Calinski-Harabasz and Davies-Bouldin, fused into a 0–5 score. Any solution where one segment holds ≥ 70% of the sample is discarded; the top three are returned for you to compare.
  • Significance-tested cross-tabs. Pairwise z-tests on every cell, ANOVA and pairwise Welch’s t-tests on numeric profiles, and automatic top/middle/bottom-box rows for Likert items, delivered in a “Corporate Blue” Excel workbook with a hyperlinked table of contents.
  • Evidence-bound persona narratives. The report is LLM-generated, but an over- or under-index is only called out when the index passes 130 / 70 and the cell is statistically significant — no agency-deck inflation.
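
The evidence rule in the last bullet can be illustrated with a minimal sketch. It assumes the index is the segment proportion divided by the total-sample proportion, times 100, and that significance comes from a pooled two-proportion z-test of the segment against the rest of the sample at the 95% level. The comparison base, the critical value, the inclusive thresholds and the function names are illustrative assumptions, not the platform's documented behaviour.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return 0.0 if se == 0 else (x1 / n1 - x2 / n2) / se


def call_out(seg_hits, seg_n, total_hits, total_n, z_crit=1.96):
    """Return 'over', 'under' or None for one cross-tab cell.

    A cell is called out only when BOTH conditions hold: the index passes
    130 (or 70) and the difference is statistically significant.
    """
    index = 100 * (seg_hits / seg_n) / (total_hits / total_n)
    # Assumed comparison base: the segment versus everyone outside it.
    z = two_proportion_z(seg_hits, seg_n,
                         total_hits - seg_hits, total_n - seg_n)
    significant = abs(z) >= z_crit
    if index >= 130 and significant:
        return "over"
    if index <= 70 and significant:
        return "under"
    return None
```

With a base of only 10 respondents, a cell can index at 133 and still return None, which is the case the rule is meant to keep out of the persona narratives.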

What you see in the chat

The flow is multi-step with three explicit human-in-the-loop checkpoints — a Marketing-Truth review, a method-selection review, and weighting setup — before clustering runs, then a solution-selection step before the report is produced. You stay in control at every decision the analysis makes.
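
As an illustration of what the Marketing-Truth review checks, here is a minimal sketch of the 60% screen. It assumes 5-point Likert items where 4 and 5 count as agreement and 1 and 2 as disagreement; the source does not define the boxes, so those codes, the function name and the data layout are hypothetical.

```python
def marketing_truth_flags(responses, threshold=0.60,
                          agree=(4, 5), disagree=(1, 2)):
    """Map each flagged variable to (direction, share of respondents).

    responses: dict of variable name -> list of Likert codes (None = missing).
    The agree/disagree codes are an assumed top-2 / bottom-2 box definition.
    """
    flags = {}
    for name, values in responses.items():
        answered = [v for v in values if v is not None]
        if not answered:
            continue  # nothing to screen on this variable
        agree_share = sum(v in agree for v in answered) / len(answered)
        disagree_share = sum(v in disagree for v in answered) / len(answered)
        if agree_share >= threshold:
            flags[name] = ("agree", agree_share)
        elif disagree_share >= threshold:
            flags[name] = ("disagree", disagree_share)
    return flags
```

The function only flags; keeping or dropping a flagged variable remains the reviewer's decision, as in the checkpoint described above.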

The deliverables are a multi-section deep-research report (Overview, Sample, Segment Building, Category Deep-Dives, Profile-Target, Market Insights, Next Steps and Appendix, with an executive summary stitched in at the top), a per-solution cross-tab workbook for every surviving solution, a comprehensive cross-tab workbook for the chosen solution, and persona narratives embedded one per segment. The report and the workbook are the two artefacts to call out — they are what traditionally takes a research agency four to six weeks.

Required data
Dataset: SPSS .sav · CSV · Excel
Segmentation vars: 5–20 attitudinal items
Profiling vars: Demographics · usage · NPS
Additional vars: Up to 15, optional
Min sample: 50 hard · 500+ ideal
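
These requirements can be expressed as a quick pre-upload check. The sketch below is illustrative only: the function name is hypothetical, and treating the 5–20 range and the 15-variable cap as hard errors is an assumption, since only the 50-respondent minimum is described as hard.

```python
def check_inputs(n_respondents, n_seg_vars, n_additional_vars=0):
    """Return (errors, warnings) for a planned segmentation run."""
    errors, warnings = [], []
    if n_respondents < 50:
        errors.append("sample below the hard minimum of 50")
    elif n_respondents < 500:
        warnings.append("sample below the 500+ ideal")
    # Assumption: counts outside the stated ranges are treated as errors.
    if not 5 <= n_seg_vars <= 20:
        errors.append("segmentation variables outside the 5-20 range")
    if n_additional_vars > 15:
        errors.append("more than 15 additional variables")
    return errors, warnings
```

For the beverage example (1,800 respondents, twelve segmentation variables), the check returns no errors and no warnings.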
Use when you hear
  • We need to refresh our customer typology — the last one is five years old.
  • Marketing is targeting one big audience. We need to break it into groups.
  • Product wants to know which features matter to which kinds of user.
  • We have attitudinal data sitting in a tracker and no one's used it for segmentation.
  • Our agency quoted six weeks for a segmentation. Can you do this in days?
Disqualifier

Only want demographic groups?

Segmentation builds attitudinal and behavioural typologies. If you only need to split the sample by age, region or channel, that’s a descriptive job — start with Data Exploration.

See Data Exploration →
Step by step

From survey upload to narrated typology.

Segmentation is the most multi-step tool on the platform — three human-in-the-loop checkpoints sit between your data and the clustering run, so you steer every decision.

1. Ask the question
“Run a segmentation on our customer base.” The platform recognises the intent and selects Segmentation.
2. State the research question
Articulate the business question. It threads through the persona narratives, the Profile-Target section and the executive summary.
3. Map the variables
Identify segmentation variables, profiling variables and up to 15 optional additional variables. Mappings are suggested from column names and SPSS metadata.
4. Marketing-Truth review
The chat flags any segmentation variable where ≥ 60% of respondents already agree or disagree. You decide which to keep or drop. (Checkpoint 1.)
5. Method & weighting setup
Review the ranked method × dimension-reduction recommendations and override if you like; supply population targets for RIM weighting. (Checkpoints 2 & 3.)
6. Sweep & pick a solution
Every k from 2 to 10 runs across every chosen method in parallel. The chat returns the top three solutions with scores, sizes and traits, and you pick one.
7. Review the deliverables
A deep-research report with an executive summary, per-solution and comprehensive cross-tab workbooks, and persona narratives. Refine or branch into the Typing Tool.
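
To make the scoring behind step 6 concrete, here is a rough sketch of pruning and ranking candidate solutions. The 70% dominance cut-off and the three metrics come from the description above; the fusion formula (an equal-weight average of min-max scaled metrics, with Davies-Bouldin inverted because lower is better, rescaled to 0–5) is an assumption, as are the function name and the dictionary keys.

```python
def rank_solutions(solutions, max_share=0.70, top_n=3):
    """Prune dominant-cluster solutions, score the rest 0-5, return the best.

    Each solution is a dict with 'sizes', 'silhouette',
    'calinski_harabasz' and 'davies_bouldin' (hypothetical keys).
    """
    # Discard any solution where one segment holds >= 70% of the sample.
    kept = [s for s in solutions
            if max(s["sizes"]) / sum(s["sizes"]) < max_share]
    if not kept:
        return []

    def scaled(key, higher_is_better):
        values = [s[key] for s in kept]
        lo, hi = min(values), max(values)
        if hi == lo:
            return [0.5] * len(kept)  # no spread: treat all as equal
        unit = [(v - lo) / (hi - lo) for v in values]
        return unit if higher_is_better else [1 - u for u in unit]

    sil = scaled("silhouette", True)
    ch = scaled("calinski_harabasz", True)
    db = scaled("davies_bouldin", False)  # lower Davies-Bouldin is better

    # Assumed fusion: equal-weight mean of the scaled metrics, times 5.
    scored = [dict(s, score=round(5 * (a + b + c) / 3, 2))
              for s, a, b, c in zip(kept, sil, ch, db)]
    return sorted(scored, key=lambda s: s["score"], reverse=True)[:top_n]
```

Pruning happens before scaling so that a discarded solution with an extreme metric cannot distort the scores of the survivors.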
Compared to

How Crowdmines compares to SPSS, specialist tools and consulting agencies.

Building a strategic customer segmentation has historically been one of the slowest and most expensive deliverables in market research — manual statistical work in SPSS, R or Mplus; point-and-click SaaS tools; or a four-to-eight-week agency engagement costed at $30K–$150K.

| Capability | Traditional (SPSS / R / Mplus) | SaaS tools (Displayr, Q, Sawtooth) | Agency / consulting | Crowdmines |
|---|---|---|---|---|
| Setup effort | Write clustering scripts, pre-process by hand, hand-pick variables | Point-and-click, but still requires manual variable typing and method choice | Brief the agency, send data, wait | Ask in the chat: variable typing, method choice and the k-sweep all automated |
| Clustering methods | One per run; analyst picks which | Usually one or two | Agency picks one (often K-Means or LCA) | Seven methods, auto-selected, plus URF and PCA dimension reduction |
| Mixed-type data | Manual encoding, often poorly handled | Typically forces a single variable type | Some agencies handle it; many don't | Native: Gower distance, K-Prototypes, KAMILA and VarSelLCM |
| k selection & validation | Analyst eyeballs a scree or silhouette plot; codes metrics by hand | Some auto-selection; silhouette and a few extras | Agency picks k and reports metrics in the deck | Full k = 2–10 sweep; every solution scored on silhouette, Calinski-Harabasz and Davies-Bouldin, fused into a 0–5 score |
| Profiling & sig-testing | Manual cross-tabs and chi-square / ANOVA in SPSS or Excel | Built-in cross-tabs; significance testing varies | Standard agency deliverable | Pairwise z-tests on every cell, ANOVA and Welch's t-tests, box-score rows |
| Deep-research report | Analyst hand-writes; days of work | Limited templated output | Forty-page deck delivered weeks later | Eight-section markdown report with an executive summary, auto-generated |
| Turnaround | Days to weeks | Hours to days | 4–8 weeks | Minutes to cluster, minutes to report |
| Cost per analysis | Analyst time (days to weeks) | Software licence + analyst time | $30K–$150K per engagement | Platform subscription, unlimited runs |
Beta Program Open

Refresh your typology in days, not six weeks.

Built for insights, brand and marketing teams refreshing or building a strategic segmentation. Seven methods, a full k-sweep, significance-tested cross-tabs and a narrated deep-research report — all inside a chat.