Segmentation

Typing Tool — turn a segmentation into a working classifier.

You already have segments. The Typing Tool finds the shortest subset of questions that still reliably places a new respondent into the right segment, then ships it as an Excel typing-tool workbook the contact centre can use that afternoon, or a packaged model bundle for teams embedding it in their own systems.

Excel workbook or model bundle · 9-rung feature ladder · Cross-validated accuracy · Calibration scoring (ECE) · Triple-metric screening
Feature-set ladder · financial-services typology
Accuracy vs. questionnaire length: six questions hit 84% accuracy, the recommended trade-off.
Recommended: FS-6 · Accuracy: 84% · 12-q ceiling: 89% · Output: XLSX

The business question it answers

“We invested in a segmentation. How do we keep using it without re-running the full study every time we meet a new customer?”

Say you’ve built a six-segment customer typology from a 50-question study with 4,000 respondents. Your CRM and contact-centre teams want to assign new prospects to those segments in real time — but no one is going to ask a new lead 50 questions. Typing Tool screens the 50, tests reduced sets, and reports: six questions get to 84% classification accuracy; ten questions get to 89%. You pick six, and the chat hands back an Excel workbook the contact centre can use that afternoon.

Two phases inside the same chat

Phase 1 — Evaluate. First you pick the output format — an Excel workbook for teams who score new respondents in a spreadsheet, or a model bundle for engineers embedding the classifier in their own systems. The platform then screens the candidate questions and tests a ladder of every size from 4 to 12 features — nine rungs.

  • The Excel path tests two interpretable classifiers — logistic regression and LDA (linear discriminant analysis) — for every rung; the model-bundle path tests up to fifteen classifier families and pickles the best.
  • Every combination is scored on cross-validated accuracy, macro-F1 (a balanced precision/recall measure) and calibration (how well the model’s confidence matches its hit rate, via Brier score and Expected Calibration Error).
  • The chat returns a ranked ladder and auto-suggests the smallest rung that clears all three quality gates — accuracy ≥ 75%, macro-F1 ≥ 70%, ECE ≤ 1.00 — with logistic preferred over LDA on ties.
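
As a rough illustration of the metrics named in the list above, the sketch below computes accuracy, macro-F1, a multiclass Brier score and ECE in plain Python. The function names, the ten-bin ECE and the toy probabilities are assumptions made for this example; the platform's own binning and averaging choices are not described on this page.

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-segment F1, so small segments count as much as large ones."""
    f1s = []
    for lab in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)

def brier_score(y_true, proba, labels):
    """Multiclass Brier score: squared error summed over classes, averaged over rows (range 0 to 2)."""
    total = 0.0
    for t, row in zip(y_true, proba):
        total += sum((p - (1.0 if lab == t else 0.0)) ** 2 for lab, p in zip(labels, row))
    return total / len(y_true)

def expected_calibration_error(y_true, proba, labels, n_bins=10):
    """Weighted mean gap between top-class confidence and hit rate across equal-width bins."""
    bins = defaultdict(list)
    for t, row in zip(y_true, proba):
        conf = max(row)
        pred = labels[row.index(conf)]
        bins[min(int(conf * n_bins), n_bins - 1)].append((conf, pred == t))
    ece = 0.0
    for items in bins.values():
        avg_conf = sum(c for c, _ in items) / len(items)
        hit_rate = sum(hit for _, hit in items) / len(items)
        ece += len(items) / len(y_true) * abs(avg_conf - hit_rate)
    return ece

# Toy data (made up): two segments, four respondents.
labels = ["A", "B"]
y_true = ["A", "A", "B", "B"]
proba = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.7, 0.3]]
y_pred = [labels[row.index(max(row))] for row in proba]

print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
print(brier_score(y_true, proba, labels), expected_calibration_error(y_true, proba, labels))
```

In a cross-validated setting these functions would be applied to held-out predictions from each fold and then averaged.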

Phase 2 — Finalise. You pick a rung, typically a trade-off between length and accuracy. The platform fits the final classifier on the full dataset and exports the deliverable for the path you chose: an Excel typing-tool workbook (plus the feature-set ladder CSV), or a pickled model bundle with a generated README and loader script. Drop in a new respondent’s answers, get back their predicted segment.
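
To make "drop in answers, get back a segment" concrete, here is a minimal sketch of how a fitted multinomial logistic model scores one respondent: a linear score per segment, then a softmax. The segment names, question names and coefficients are invented for illustration, and the workbook's actual formulas are not documented on this page.

```python
import math

# Hypothetical coefficients for a 3-segment, 2-question model (not real output).
MODEL = {
    "Savers":          {"intercept": 0.5,  "q_risk": -0.8, "q_digital": 0.1},
    "Investors":       {"intercept": -1.0, "q_risk": 0.9,  "q_digital": 0.2},
    "Digital natives": {"intercept": 0.0,  "q_risk": 0.1,  "q_digital": 0.5},
}

def predict_segment(answers, model=MODEL):
    """Return (most likely segment, probability per segment) for one respondent."""
    scores = {
        seg: coef["intercept"] + sum(coef[q] * value for q, value in answers.items())
        for seg, coef in model.items()
    }
    top = max(scores.values())  # subtract the max before exponentiating for numerical stability
    exps = {seg: math.exp(s - top) for seg, s in scores.items()}
    total = sum(exps.values())
    probs = {seg: e / total for seg, e in exps.items()}
    return max(probs, key=probs.get), probs

answers = {"q_risk": 5, "q_digital": 2}
segment, probs = predict_segment(answers)
print(segment, round(probs[segment], 3))
```

The same arithmetic (a weighted sum per segment followed by a normalised exponential) can be written with ordinary spreadsheet formulas, which is one plausible reason the Excel path restricts itself to linear classifiers.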

How feature screening works

The screening step reduces a large pool of candidate predictor questions (often 30–50+) down to a shortlist of the most segment-discriminating items. The platform fuses three independent scoring methods:

  • ANOVA F-statistic — how much each predictor’s mean differs across segments (linear separation).
  • Mutual Information — non-linear dependency between predictor and segment label.
  • SHAP importance — game-theoretic contribution from a tree-based classifier (interaction effects).
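
The page does not say how the three scores are fused. One simple possibility, shown here purely as an assumption, is mean-rank fusion: rank the candidates under each method, average the ranks, and sort. The question names and score values below are made up.

```python
def ranks(scores):
    """Map feature -> rank under one scoring method (1 = most discriminating)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {feat: i + 1 for i, feat in enumerate(ordered)}

def fuse_by_mean_rank(score_tables):
    """Order features by their average rank across all scoring methods."""
    per_method = [ranks(table) for table in score_tables]
    feats = list(score_tables[0])
    mean_rank = {f: sum(r[f] for r in per_method) / len(per_method) for f in feats}
    return sorted(feats, key=lambda f: (mean_rank[f], f))

# Hypothetical scores for four candidate questions.
anova_f = {"q1": 40.0, "q2": 12.0, "q3": 3.0, "q4": 25.0}
mutual_info = {"q1": 0.30, "q2": 0.05, "q3": 0.22, "q4": 0.18}
shap_importance = {"q1": 0.12, "q2": 0.02, "q3": 0.09, "q4": 0.10}

shortlist = fuse_by_mean_rank([anova_f, mutual_info, shap_importance])
print(shortlist)
```

Working on ranks rather than raw values sidesteps the fact that F-statistics, mutual information and SHAP values live on different scales.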

Bootstrap stability (30 iterations of MI scoring) guards against features that score high by chance. Correlation pruning at r > 0.90 prevents two highly correlated questions from both surviving the shortlist.
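
Both safeguards can be sketched in a few lines of plain Python. The version below assumes discrete answers for the mutual-information step, measures stability as the share of resamples in which a feature lands in the top k, and prunes on the absolute value of Pearson's r; those details, along with all names and data, are assumptions for this example and not a description of the platform's internals.

```python
import math
import random
from collections import Counter

def mutual_information(x, y):
    """Discrete mutual information (in nats) between two equal-length sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log((c * n) / (px[a] * py[b])) for (a, b), c in pxy.items())

def bootstrap_stability(features, labels, top_k, n_iter=30, seed=0):
    """Share of bootstrap resamples in which each feature ranks in the MI top-k."""
    rng = random.Random(seed)
    n = len(labels)
    hits = Counter()
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        y = [labels[i] for i in idx]
        mi = {f: mutual_information([col[i] for i in idx], y) for f, col in features.items()}
        for f in sorted(mi, key=mi.get, reverse=True)[:top_k]:
            hits[f] += 1
    return {f: hits[f] / n_iter for f in features}

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def prune_correlated(ranked, data, threshold=0.90):
    """Walk features best-first; keep one only if |r| <= threshold against everything kept so far."""
    kept = []
    for feat in ranked:
        if all(abs(pearson(data[feat], data[k])) <= threshold for k in kept):
            kept.append(feat)
    return kept

# Toy stability check: one informative question, one constant question.
labels = ["A"] * 10 + ["B"] * 10
features = {"q_signal": [1] * 10 + [0] * 10, "q_const": [3] * 20}
stability = bootstrap_stability(features, labels, top_k=1)

# Toy pruning check: q2 nearly duplicates q1, q3 does not.
data = {"q1": [1, 2, 3, 4, 5], "q2": [1, 2, 3, 4, 6], "q3": [5, 1, 4, 2, 3]}
survivors = prune_correlated(["q1", "q2", "q3"], data)
print(stability, survivors)
```

Pruning best-first means that when two questions are near-duplicates, the one with the stronger screening rank is the one that survives.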

Required data
Baseline: Labelled segmentation dataset
Candidates: Pool of predictor questions
Min CV accuracy: 75% (default)
Min macro-F1: 0.70 (default)
Min sample: A few hundred labelled respondents
Use when you hear
  • We have segments — how do we keep classifying new customers cheaply?
  • We can’t run the full segmentation survey on every lead.
  • We want the contact centre / sales team / website to know which segment a person belongs to.
  • We did a segmentation last year — what’s the lightweight version?
  • How do we make the segmentation operational, not just a deck?
Disqualifier

No existing segmentation?

Start there first — Crowdmines’s Segmentation service builds the original typology; Typing Tool only operationalises one that already exists.

See Segmentation →
Step by step

From 50 questions to a working Excel workbook.

Two phases inside the same chat session — evaluate, then finalise.

1. Ask the question
“Build a typing tool for our customer segments” — the platform recognises the intent and selects the Typing Tool.

2. Map the variables
Identify the segment-label column and the pool of candidate predictor questions.

3. Pick the output format
Excel workbook (formula-native, no Python) or model bundle (a pickled classifier) — this sets the classifier catalog. Optional filters too.

4. Phase 1 — Evaluate
Triple-metric screening, then a ladder of every size from 4 to 12 features. The Excel path tests logistic + LDA — 18 configurations — cross-validated and ranked.

5. Pick a rung
The platform auto-suggests the smallest rung clearing all three gates — accuracy ≥ 75%, macro-F1 ≥ 70%, ECE ≤ 1.00.

6. Phase 2 — Finalise
Final model fit on the full dataset; scenario predictions for typical Low / Mid / High respondent profiles.

7. Download the deliverables
Excel path: typing-tool workbook + ladder CSV + report. Model-bundle path: pickled model + README + loader script + CSV + report.
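
The ladder and the rung suggestion from the steps above can be sketched as follows. The gate values come from this page; the ladder rows are invented, and treating a "tie" as two classifiers clearing the gates at the same ladder size is an interpretation, not a documented rule.

```python
# Gates as stated on this page: accuracy >= 75%, macro-F1 >= 70%, ECE <= 1.00.
GATES = {"accuracy": 0.75, "macro_f1": 0.70, "ece": 1.00}

# Excel path: every size from 4 to 12 features, two classifiers each.
configs = [(k, clf) for k in range(4, 13) for clf in ("logistic", "lda")]

# Hypothetical cross-validated results for the first three rungs (made-up numbers).
ladder = [
    {"n_features": 4, "classifier": "logistic", "accuracy": 0.72, "macro_f1": 0.68, "ece": 0.08},
    {"n_features": 4, "classifier": "lda",      "accuracy": 0.71, "macro_f1": 0.66, "ece": 0.09},
    {"n_features": 5, "classifier": "logistic", "accuracy": 0.74, "macro_f1": 0.71, "ece": 0.07},
    {"n_features": 5, "classifier": "lda",      "accuracy": 0.76, "macro_f1": 0.69, "ece": 0.07},
    {"n_features": 6, "classifier": "lda",      "accuracy": 0.84, "macro_f1": 0.80, "ece": 0.05},
    {"n_features": 6, "classifier": "logistic", "accuracy": 0.84, "macro_f1": 0.80, "ece": 0.05},
]

def clears_gates(row, gates=GATES):
    return (
        row["accuracy"] >= gates["accuracy"]
        and row["macro_f1"] >= gates["macro_f1"]
        and row["ece"] <= gates["ece"]
    )

def suggest_rung(ladder, gates=GATES):
    """Smallest rung that clears every gate; logistic wins when both classifiers qualify at that size."""
    passing = [row for row in ladder if clears_gates(row, gates)]
    if not passing:
        return None
    return min(passing, key=lambda r: (r["n_features"], r["classifier"] != "logistic"))

suggestion = suggest_rung(ladder)
print(len(configs), suggestion["n_features"], suggestion["classifier"])
```

In this toy ladder the five-question rung fails because each classifier misses one gate, so the suggestion moves up to six questions.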
Compared to

How Crowdmines compares to R / Python / SPSS and agency engagements.

Building a typing tool — a short-form classifier that assigns new respondents to existing segments — is traditionally a specialist task performed by a data scientist or research statistician using R, Python or SPSS’s discriminant analysis module. Some research agencies offer it as a service deliverable, typically scoped at 1–3 weeks and priced as a standalone engagement.

Capability | Traditional (R / Python / SPSS) | Agency service | Crowdmines
Setup effort | Write feature-selection code, set up CV pipelines, tune classifiers, export manually | Brief the agency, send data, wait for delivery | Ask in the chat — variables mapped, screening automated
Feature screening | Manual: analyst picks candidate features using domain knowledge or stepwise selection | Agency uses their preferred method (varies) | Triple-metric screening (ANOVA + Mutual Information + SHAP) with bootstrap stability and correlation pruning
Feature ladder | Analyst tests a few sizes manually, compares accuracy | Agency tests a few configurations | Systematic ladder — every size from 4 to 12 features — × 2 classifiers on the Excel path = 18 configurations tested and ranked automatically
Classifier types | Analyst picks one (usually logistic) | Agency picks one | Excel path: logistic + LDA for every rung. Model-bundle path: up to fifteen classifier families — ridge / lasso / elasticnet logistic, decision tree, random forest, SVC, naive Bayes, Gaussian process, MLP, and XGBoost / LightGBM / CatBoost when installed
Output paths | Single bespoke deliverable per project | Single deliverable | Two paths picked up-front — Excel workbook (formula-native, no Python) or model bundle (joblib pickle + README + loader script)
Evaluation metrics | Analyst computes accuracy; F1 and calibration often skipped | Agency reports accuracy (sometimes) | Accuracy, macro-F1, Brier score and ECE — all cross-validated with standard deviations
Auto-selection | Analyst picks manually | Agency recommends | Platform suggests the best rung with transparent trade-off reasoning
Operational deliverable | Analyst builds a custom Excel workbook or scoring function | Agency delivers an Excel workbook (their format) | Standardised Excel typing-tool workbook + trained model pickle + evaluation CSV — all auto-generated
Turnaround | Days (analyst time) | 1–3 weeks | Minutes
Beta Program Open

Six questions, 84% accuracy — an Excel workbook the contact centre can use that afternoon.

Triple-metric feature screening, cross-validated accuracy & calibration, transparent accuracy-vs-length trade-off. Ship the operational artefact in minutes, not weeks.