Customer Typology

Segmentation — who are our customers really, and how should we organise around them?

Segmentation turns a survey into a small number of distinct, actionable respondent groups — and delivers a profiled, narrated report explaining each one. It is not a single clustering algorithm in a wrapper: an orchestrator profiles the data, fans out across seven methods and a full cluster-count sweep, scores every candidate solution, and hands the one you choose to a deep-research report microservice. The markdown report and the significance-tested Excel workbook are the two artefacts that traditionally take a research agency four to six weeks.

Seven-method orchestrator · Full k-sweep, 2–10 · Significance-tested cross-tabs · Deep-research report · Marketing-Truth pre-flight

Segment solution · Beverage typology (live result)

Six segments, six distinct signatures

Attitudinal profile — each cell indexed vs the sample average (100)

Method: KAMILA · Segments: 6 · Silhouette: 0.31

The business question it answers

“Who are our customers really, and how should we organise our marketing, product and CX work around them?”

Say a beverage brand fields a 60-question study with 1,800 respondents. They feed in twelve attitudinal items — need-states, occasion preferences, brand-attitude statements — as segmentation variables, and twenty more — demographics, usage frequency, brand awareness, NPS — as profiling variables. The orchestrator screens the twelve for items where everyone already agrees, runs several clustering methods in parallel across k = 2..10, prunes the dominant-cluster solutions, and returns the top three. The user picks one — six segments, balanced sizes, silhouette 0.31, a KAMILA solution — and the chat hands back a multi-section markdown report with named personas, a significance-tested Excel workbook, and an executive summary slotted ahead of section one.

How the methodology works

Segmentation is the most multi-stage tool on the platform. The orchestrator types the variables, decides which clustering methods suit the data, runs them across a sweep of cluster counts in parallel, scores and prunes the resulting solutions, builds significance-tested cross-tab workbooks, and hands the chosen solution to a deep-research report microservice.

Seven methods, auto-selected. Six clustering algorithms — K-Means, Hierarchical, LCA, K-Prototypes, VarSelLCM and KAMILA — plus URF importance filtering. The orchestrator scores each method against the data profile and pairs it with a dimension-reduction option: none, PCA or URF.
Marketing-Truth pre-flight. Any segmentation variable where ≥ 60% of respondents already agree (or already disagree) is flagged before clustering, because a typology built on a variable everyone agrees with is dead on arrival.
A full k-sweep, not a guess. Every k from 2 to 10 runs against every chosen method in parallel — a typical run produces 30–60 candidate solutions, instead of eyeballing a single silhouette plot.
Composite scoring and pruning. Each solution is scored on silhouette, Calinski-Harabasz and Davies-Bouldin, fused into a 0–5 score. Any solution where one segment holds ≥ 70% of the sample is discarded; the top three are returned for you to compare.
Significance-tested cross-tabs. Pairwise z-tests on every cell, ANOVA and pairwise Welch’s t-tests on numeric profiles, and automatic top/middle/bottom-box rows for Likert items, delivered in a “Corporate Blue” Excel workbook with a hyperlinked table of contents.
Evidence-bound persona narratives. The report is LLM-generated, but an over- or under-index is only called out when the index passes 130 / 70 and the cell is statistically significant — no agency-deck inflation.
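
The evidence rule for persona narratives can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the platform's implementation: the function names are hypothetical, the thresholds are treated as inclusive (index ≥ 130 or ≤ 70), significance is a pooled two-proportion z-test of the segment against the rest of the sample at |z| ≥ 1.96, and the page itself describes pairwise tests whose exact form is not specified.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se if se > 0 else 0.0


def callout(seg_hits, seg_n, total_hits, total_n, z_crit=1.96):
    """Return 'over', 'under' or None for one segment-by-answer cell.

    Assumption: the segment is tested against the rest of the sample,
    and the 130 / 70 index thresholds are inclusive.
    """
    rest_hits, rest_n = total_hits - seg_hits, total_n - seg_n
    if total_hits == 0 or seg_n == 0 or rest_n == 0:
        return None  # index or test undefined
    # Index: segment share relative to the sample average (= 100).
    index = 100 * (seg_hits / seg_n) / (total_hits / total_n)
    z = two_proportion_z(seg_hits, seg_n, rest_hits, rest_n)
    significant = abs(z) >= z_crit
    if index >= 130 and significant:
        return "over"
    if index <= 70 and significant:
        return "under"
    return None


# Illustrative numbers only: 150 of 300 segment members agree,
# against 540 of 1,800 respondents overall (index 167).
print(callout(150, 300, 540, 1800))
# A high index on a tiny base is not called out.
print(callout(4, 10, 540, 1800))
```

Requiring both conditions is the point of the rule: a large index on a small segment base is suppressed, and so is a significant but small difference.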

What you see in the chat

The flow is multi-step with three explicit human-in-the-loop checkpoints — a Marketing-Truth review, a method-selection review, and weighting setup — before clustering runs, then a solution-selection step before the report is produced. You stay in control at every decision the analysis makes.

The deliverables are a multi-section deep-research report (Overview, Sample, Segment Building, Category Deep-Dives, Profile-Target, Market Insights, Next Steps and Appendix, with an executive summary stitched in at the top), a per-solution cross-tab workbook for every surviving solution, a comprehensive cross-tab workbook for the chosen solution, and persona narratives embedded one per segment. The report and the workbook are the two artefacts to call out — they are what traditionally takes a research agency four to six weeks.

Required data

Dataset: SPSS .sav · CSV · Excel

Segmentation vars: 5–20 attitudinal items

Profiling vars: Demographics · usage · NPS

Additional vars: Up to 15, optional

Min sample: 50 hard · 500+ ideal

Use when you hear

We need to refresh our customer typology — the last one is five years old.
Marketing is targeting one big audience. We need to break it into groups.
Product wants to know which features matter to which kinds of user.
We have attitudinal data sitting in a tracker and no one's used it for segmentation.
Our agency quoted six weeks for a segmentation. Can you do this in days?

Disqualifier

Only want demographic groups?

Segmentation builds attitudinal and behavioural typologies. If you only need to split the sample by age, region or channel, that’s a descriptive job — start with Data Exploration.

See Data Exploration →

Step by step

From survey upload to narrated typology.

Segmentation is the most multi-step tool on the platform — three human-in-the-loop checkpoints sit between your data and the clustering run, so you steer every decision.

Ask the question

“Run a segmentation on our customer base.” The platform recognises the intent and selects Segmentation.

State the research question

Articulate the business question. It threads through the persona narratives, the Profile-Target section and the executive summary.

Map the variables

Identify segmentation variables, profiling variables and up to 15 optional additional variables. Mappings are suggested from column names and SPSS metadata.

Marketing-Truth review

The chat flags any segmentation variable where ≥ 60% of respondents already agree or disagree. You decide which to keep or drop. (Checkpoint 1.)
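
As a rough sketch of what this check involves, the snippet below flags variables that cross the 60% line. It assumes 5-point Likert coding in which 4–5 count as agreement and 1–2 as disagreement; that coding, the function name and the parameter names are illustrative assumptions rather than documented platform behaviour.

```python
def marketing_truth_flags(responses, threshold=0.60,
                          agree=(4, 5), disagree=(1, 2)):
    """Flag variables where most respondents already agree or disagree.

    responses: {variable_name: [answer codes, None for missing]}
    Assumption: 5-point Likert items, 4-5 = agree, 1-2 = disagree.
    """
    flags = {}
    for var, values in responses.items():
        answered = [v for v in values if v is not None]
        if not answered:
            continue  # nothing to evaluate for this variable
        agree_share = sum(v in agree for v in answered) / len(answered)
        disagree_share = sum(v in disagree for v in answered) / len(answered)
        if agree_share >= threshold:
            flags[var] = ("agree", agree_share)
        elif disagree_share >= threshold:
            flags[var] = ("disagree", disagree_share)
    return flags


# Illustrative answers for three hypothetical attitude items.
survey = {
    "q1_taste_matters": [5, 5, 4, 4, 4, 3, 2, 5, 4, 1],
    "q2_try_new_brands": [1, 2, 3, 4, 5, 3, 3, 2, 4, 3],
    "q3_price_is_irrelevant": [1, 1, 2, 2, 1, 2, 3, 1, 2, 5],
}
for name, (direction, share) in marketing_truth_flags(survey).items():
    print(f"{name}: {share:.0%} {direction}")
```

The flagged items are only candidates for removal; the keep-or-drop decision stays with you at the checkpoint.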

Method & weighting setup

Review the ranked method × dimension-reduction recommendations and override if you like; supply population targets for RIM weighting. (Checkpoints 2 & 3.)
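
RIM weighting is also known as raking: respondent weights are rescaled one variable at a time until the weighted margins match the population targets. The sketch below shows that general technique in plain Python. The function name, data layout and stopping rule are assumptions for illustration, and it expects every category in the data to have a target share.

```python
def rim_weights(rows, targets, max_iter=100, tol=1e-6):
    """Rake respondent weights to population targets.

    rows:    list of dicts, one per respondent, e.g. {"gender": "f"}
    targets: {variable: {category: target share}}, shares summing to 1
    Returns weights normalised to a mean of 1.
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_shift = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total per category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Rescale each respondent so the margin hits its target.
            for i, row in enumerate(rows):
                factor = shares[row[var]] * total / current[row[var]]
                weights[i] *= factor
                max_shift = max(max_shift, abs(factor - 1.0))
        if max_shift < tol:
            break  # every margin already matches its target
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]


# Illustrative sample: 60% female and 70% young, raked to 50/50 and 60/40.
sample = ([{"gender": "f", "age": "young"}] * 4
          + [{"gender": "f", "age": "old"}] * 2
          + [{"gender": "m", "age": "young"}] * 3
          + [{"gender": "m", "age": "old"}] * 1)
pop = {"gender": {"f": 0.5, "m": 0.5}, "age": {"young": 0.6, "old": 0.4}}
w = rim_weights(sample, pop)
print([round(x, 3) for x in w])
```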

Sweep & pick a solution

Every k from 2 to 10 runs across every chosen method in parallel. The chat returns the top three solutions with scores, sizes and traits — you pick one.
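
The pruning and ranking behind that shortlist might look something like the sketch below. The 70% dominance rule and the three metrics come from the description above; how the metrics are fused into a 0–5 score is not documented here, so this version assumes min-max scaling across the surviving candidates, inverts Davies-Bouldin (lower is better), and averages the three with equal weight. All names and numbers are illustrative.

```python
def rank_solutions(candidates, dominance=0.70, top_n=3):
    """Drop dominant-cluster solutions, score the rest 0-5, return the best.

    Assumption: the composite is an equal-weight mean of min-max scaled
    metrics; the platform's actual fusion formula may differ.
    """
    kept = [c for c in candidates
            if max(c["sizes"]) / sum(c["sizes"]) < dominance]
    if not kept:
        return []

    def scaled(key, higher_is_better=True):
        vals = [c[key] for c in kept]
        lo, hi = min(vals), max(vals)
        out = []
        for v in vals:
            s = 0.5 if hi == lo else (v - lo) / (hi - lo)
            out.append(s if higher_is_better else 1 - s)
        return out

    sil = scaled("silhouette")
    ch = scaled("calinski_harabasz")
    db = scaled("davies_bouldin", higher_is_better=False)
    ranked = [{**c, "score": round(5 * (s + h + d) / 3, 2)}
              for c, s, h, d in zip(kept, sil, ch, db)]
    ranked.sort(key=lambda c: c["score"], reverse=True)
    return ranked[:top_n]


# Made-up candidates for illustration, not real results.
sweep = [
    {"method": "KAMILA", "k": 6, "sizes": [320, 310, 300, 290, 290, 290],
     "silhouette": 0.31, "calinski_harabasz": 410.0, "davies_bouldin": 1.10},
    {"method": "K-Means", "k": 2, "sizes": [1500, 300],
     "silhouette": 0.45, "calinski_harabasz": 500.0, "davies_bouldin": 0.90},
    {"method": "LCA", "k": 4, "sizes": [500, 450, 450, 400],
     "silhouette": 0.22, "calinski_harabasz": 300.0, "davies_bouldin": 1.60},
    {"method": "K-Prototypes", "k": 5, "sizes": [400, 380, 360, 340, 320],
     "silhouette": 0.27, "calinski_harabasz": 350.0, "davies_bouldin": 1.30},
]
for s in rank_solutions(sweep):
    print(s["method"], s["k"], s["score"])
```

In this example the two-cluster K-Means solution has the best raw metrics but is discarded because one segment holds 83% of the sample, which is the failure mode the pruning step exists to catch.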

Review the deliverables

A deep-research report with an executive summary, per-solution and comprehensive cross-tab workbooks, and persona narratives — refine or branch into the Typing Tool.

Compared to

How Crowdmines compares to SPSS, specialist tools and consulting agencies.

Building a strategic customer segmentation has historically been one of the slowest and most expensive deliverables in market research — manual statistical work in SPSS, R or Mplus; point-and-click SaaS tools; or a four-to-eight-week agency engagement costed at $30K–$150K.

Capability	Traditional (SPSS / R / Mplus)	SaaS tools (Displayr, Q, Sawtooth)	Agency / consulting	Crowdmines
Setup effort	Write clustering scripts, pre-process by hand, hand-pick variables	Point-and-click, but still requires manual variable typing and method choice	Brief the agency, send data, wait	Ask in the chat — variable typing, method choice and the k-sweep all automated
Clustering methods	One per run; analyst picks which	Usually one or two	Agency picks one (often K-Means or LCA)	Seven methods, auto-selected, plus URF and PCA dimension reduction
Mixed-type data	Manual encoding, often poorly handled	Typically forces a single variable type	Some agencies handle it; many don't	Native — Gower distance, K-Prototypes, KAMILA and VarSelLCM
k selection & validation	Analyst eyeballs a scree or silhouette plot; codes metrics by hand	Some auto-selection; silhouette and a few extras	Agency picks k and reports metrics in the deck	Full k = 2–10 sweep; every solution scored on silhouette, Calinski-Harabasz and Davies-Bouldin, fused into a 0–5 score
Profiling & sig-testing	Manual cross-tabs and chi-square / ANOVA in SPSS or Excel	Built-in cross-tabs; significance testing varies	Standard agency deliverable	Pairwise z-tests on every cell, ANOVA and Welch's t-tests, box-score rows
Deep-research report	Analyst hand-writes — days of work	Limited templated output	Forty-page deck delivered weeks later	Eight-section markdown report with an executive summary, auto-generated
Turnaround	Days to weeks	Hours to days	4–8 weeks	Minutes to cluster, minutes to report
Cost per analysis	Analyst time (days to weeks)	Software licence + analyst time	$30K–$150K per engagement	Platform subscription — unlimited runs

Beta Program Open

Refresh your typology in days, not six weeks.

Built for insights, brand and marketing teams refreshing or building a strategic segmentation. Seven methods, a full k-sweep, significance-tested cross-tabs and a narrated deep-research report — all inside a chat.