Driver Analysis has a richer interactive flow than the simpler tools because it involves research-question framing, variable mapping, dependent-variable (DV) recoding, and model selection, all guided by the chat. Here is what the user experiences.
1. Ask the question
The user types something like:
- "What's driving NPS in our business banking segment?"
- "Which product features matter most to satisfaction?"
- "Why is churn higher this quarter?"
The platform recognises the intent and selects Driver Analysis.
2. State the research question
The chat asks the user to articulate the specific business question they are trying to answer. This question is threaded through the entire analysis and appears in the final report. For example:
- "What drives NPS among our premium customers?"
3. Map the variables
The chat asks the user to identify:
- Dependent variable(s) — the outcome being explained (e.g. NPS score, overall satisfaction, intent to recommend)
- Independent variables — the attributes hypothesised to drive the outcome (e.g. product quality, price perception, customer service rating)
The platform suggests mappings based on column names and types. The user confirms or adjusts.
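As a rough illustration of how name- and type-based suggestions could work, here is a minimal sketch. The keyword list, the coarse type labels, and the `suggest_mapping` function are all hypothetical; the platform's actual matching rules are not described here.

```python
# Hypothetical keywords that hint a column is an outcome measure.
OUTCOME_HINTS = ("nps", "satisfaction", "recommend", "churn")


def suggest_mapping(columns: dict[str, str]) -> dict[str, list[str]]:
    """Propose DV/IV roles from column names and coarse types.

    `columns` maps column name -> one of 'numeric', 'categorical', 'text', 'id'
    (an assumed type vocabulary for this sketch).
    """
    suggestion = {"dependent": [], "independent": [], "ignored": []}
    for name, col_type in columns.items():
        lowered = name.lower()
        if col_type in ("text", "id"):
            # Free text and identifiers are not usable as drivers or outcomes.
            suggestion["ignored"].append(name)
        elif any(hint in lowered for hint in OUTCOME_HINTS):
            suggestion["dependent"].append(name)
        else:
            suggestion["independent"].append(name)
    return suggestion


example = suggest_mapping({
    "respondent_id": "id",
    "nps_score": "numeric",
    "product_quality": "numeric",
    "price_perception": "numeric",
    "open_comment": "text",
})
```

The result is only a starting point: the user still confirms or adjusts each role in the chat.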
4. Review the analysis plan
The platform generates an analysis plan based on the data shape and variable types, including data quality checks and any transformations needed. The user reviews the plan in the chat and can adjust it before proceeding.
5. Choose how to recode the dependent variable
For each DV, the chat presents recoding options:
- NPS scheme — auto-detected if the DV looks like a 0–10 NPS scale. Collapses into Detractor / Passive / Promoter and triggers the two-stage model.
- Top-2-box — collapses Likert ratings into binary (top 2 vs rest).
- Custom — user-defined binning.
- None — use raw values as-is.
The platform recommends a scheme based on the variable type; the user confirms or overrides.
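The recoding schemes above can be sketched in a few lines. The NPS bands (0 to 6 Detractor, 7 to 8 Passive, 9 to 10 Promoter) are the standard definition; the function names, the 5-point default scale, and the `looks_like_nps` detection heuristic are assumptions made for illustration, not the platform's implementation.

```python
def recode_nps(score: int) -> str:
    """Standard NPS bands: 0-6 Detractor, 7-8 Passive, 9-10 Promoter."""
    if not 0 <= score <= 10:
        raise ValueError(f"NPS score out of range: {score}")
    if score <= 6:
        return "Detractor"
    if score <= 8:
        return "Passive"
    return "Promoter"


def recode_top2box(rating: int, scale_max: int = 5) -> int:
    """Return 1 if the rating is one of the top two scale points, else 0."""
    return 1 if rating >= scale_max - 1 else 0


def looks_like_nps(values: list[int]) -> bool:
    """Assumed heuristic: all values are integers in 0-10 and at least one
    exceeds 7, which rules out typical 1-5 and 1-7 Likert scales."""
    in_range = all(isinstance(v, int) and 0 <= v <= 10 for v in values)
    return in_range and any(v > 7 for v in values)
```

For example, `recode_nps(7)` gives `"Passive"`, and `recode_top2box(4)` gives `1` on the default 5-point scale.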
6. Review the model plan
The platform recommends which models to fit based on the DV type and sample size (e.g. "RF + GBM + XGBoost + Logistic for this binary DV with n = 800"). The full menu spans tree ensembles (random forest, GBM, XGBoost, LightGBM, CatBoost), linear and logistic regression, PLS, MARS, Bayesian ridge / network, and a small neural net. The user reviews the model selection and can add or remove model types, and optionally turns on SHAP explanations for tree models.
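To make the idea of a recommendation driven by DV type and sample size concrete, here is a minimal sketch. The `recommend_models` function, the model identifiers, and the 200-row threshold are hypothetical; they are chosen only so that the example from the text (a binary DV with n = 800) yields the same four models.

```python
TREE_MODELS = ["random_forest", "gbm", "xgboost"]
MIN_ROWS_FOR_TREES = 200  # hypothetical threshold, not a documented value


def recommend_models(dv_type: str, n: int, use_shap: bool = False) -> dict:
    """Suggest a model plan from the DV type and sample size."""
    if dv_type == "binary":
        baseline = ["logistic"]
    elif dv_type == "continuous":
        baseline = ["linear"]
    else:
        raise ValueError(f"Unsupported DV type: {dv_type}")

    # Only propose tree ensembles when there is enough data to fit them.
    trees = TREE_MODELS if n >= MIN_ROWS_FOR_TREES else []
    return {
        "models": trees + baseline,
        # SHAP explanations are optional and apply to tree models only.
        "shap_for": list(trees) if use_shap else [],
    }


plan = recommend_models("binary", 800, use_shap=True)
```

The user would then add or remove entries from `plan["models"]` before confirming.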
7. Apply filters (optional)
The user can narrow the dataset in plain language, for example:
- "Only respondents from the last 6 months"
- "Exclude incomplete surveys"
The platform then checks that the filtered sample is large enough for the chosen models.
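A sketch of what filtering followed by a sample-size check might look like, using the two example filters above. The per-model minimums in `MIN_ROWS`, the row fields, and both helper functions are assumptions for illustration.

```python
from datetime import date, timedelta

# Hypothetical minimum row counts per model.
MIN_ROWS = {"logistic": 100, "random_forest": 200, "xgboost": 300}


def apply_filters(rows, predicates):
    """Keep only rows that satisfy every predicate."""
    return [row for row in rows if all(pred(row) for pred in predicates)]


def undersized_models(n_rows, models):
    """Return the models whose assumed minimum sample size is not met."""
    return [m for m in models if n_rows < MIN_ROWS.get(m, 0)]


as_of = date(2024, 6, 30)             # fixed reference date for the example
cutoff = as_of - timedelta(days=182)  # roughly six months

rows = [
    {"completed": True, "survey_date": date(2024, 5, 1)},
    {"completed": False, "survey_date": date(2024, 5, 2)},
    {"completed": True, "survey_date": date(2023, 1, 1)},
]
filters = [
    lambda r: r["survey_date"] >= cutoff,  # "Only respondents from the last 6 months"
    lambda r: r["completed"],              # "Exclude incomplete surveys"
]

kept = apply_filters(rows, filters)
too_small = undersized_models(len(kept), ["logistic", "random_forest"])
```

Here only one row survives both filters, so both models would be flagged as undersized and the user could relax a filter or drop a model.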
8. Choose a subgroup (optional)
The user can ask for results broken out by a segment, for example:
- "Split by customer tier"
9. Final confirmation
The chat presents a complete summary of everything configured:
- Research question
- Dependent and independent variables
- DV recoding scheme
- Model plan
- Filters and subgroups
- Expected sample size
- Weighting configuration (if set)
The user confirms, and the analysis runs.
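One way to picture the confirmed configuration is as a single record holding each item in the summary. The `DriverAnalysisConfig` class, its field names, and the example values below are hypothetical and do not reflect the platform's internal schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DriverAnalysisConfig:
    research_question: str
    dependent_variables: list[str]
    independent_variables: list[str]
    dv_recoding: dict[str, str]  # DV name -> recoding scheme
    models: list[str]
    filters: list[str] = field(default_factory=list)
    subgroup: Optional[str] = None
    expected_sample_size: Optional[int] = None
    weighting: Optional[str] = None

    def summary_lines(self) -> list[str]:
        """Render the confirmation summary, omitting weighting if not set."""
        lines = [
            f"Research question: {self.research_question}",
            f"Dependent variables: {', '.join(self.dependent_variables)}",
            f"Independent variables: {', '.join(self.independent_variables)}",
            f"DV recoding: {self.dv_recoding}",
            f"Model plan: {', '.join(self.models)}",
            f"Filters: {', '.join(self.filters) or 'none'}",
            f"Subgroup: {self.subgroup or 'none'}",
            f"Expected sample size: {self.expected_sample_size}",
        ]
        if self.weighting:
            lines.append(f"Weighting: {self.weighting}")
        return lines


config = DriverAnalysisConfig(
    research_question="What drives NPS among our premium customers?",
    dependent_variables=["nps_score"],
    independent_variables=["product_quality", "price_perception"],
    dv_recoding={"nps_score": "nps"},
    models=["random_forest", "logistic"],
    filters=["Exclude incomplete surveys"],
    subgroup="customer tier",
    expected_sample_size=800,
)
lines = config.summary_lines()
```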
10. Review results
Within a couple of minutes (depending on data size and number of models), the chat returns:
- Model comparison table — which model performed best and why
- Importance ranking — features ranked by importance (SHAP-based for tree models when SHAP is enabled, model-native otherwise), categorised as primary / secondary / low priority. Control variables are netted out so the client sees only the actionable drivers
- Quadrant map — importance vs. performance for prioritisation
- NPS two-stage breakdown (if NPS scheme was selected) — separate importance rankings for "what creates detractors" vs "what creates promoters"; a stage with too few respondents is skipped with a logged reason while the other still runs
- Calibration-impact view (if a calibration variable was supplied) — weighted vs unweighted ranking with a rank-change table and an AI-generated narrative
- Executive summary — AI-generated narrative tying findings to the research question
- Downloadable artefacts — markdown, PDF, PowerPoint, and a JSON metadata file
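The quadrant map pairs each driver's importance with its current performance. A minimal sketch follows, assuming the quadrants are split at the median of each axis; the split rule, the quadrant labels, and the example numbers are illustrative assumptions, not values produced by the platform.

```python
from statistics import median


def quadrant_map(drivers: dict[str, tuple[float, float]]) -> dict[str, str]:
    """Assign each driver to a quadrant.

    `drivers` maps driver name -> (importance, performance).
    Cut points are the medians of each axis (an assumption of this sketch).
    """
    imp_cut = median(imp for imp, _ in drivers.values())
    perf_cut = median(perf for _, perf in drivers.values())

    result = {}
    for name, (imp, perf) in drivers.items():
        high_imp = imp >= imp_cut
        high_perf = perf >= perf_cut
        if high_imp and not high_perf:
            result[name] = "improve"       # matters a lot, performing poorly
        elif high_imp and high_perf:
            result[name] = "maintain"      # matters a lot, performing well
        elif not high_imp and high_perf:
            result[name] = "monitor"       # performing well, matters less
        else:
            result[name] = "low priority"  # matters less, performing poorly
    return result


quadrants = quadrant_map({
    "product_quality": (0.40, 3.2),
    "customer_service": (0.30, 4.4),
    "price_perception": (0.20, 3.0),
    "branding": (0.10, 4.5),
})
```

In this made-up example, `product_quality` lands in the "improve" quadrant because it has the highest importance but below-median performance, which is the kind of driver the map is meant to surface for prioritisation.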
11. Follow up
The user can ask follow-up questions in the chat, such as:
- "What if we control for income?"
- "Run the same analysis but only for churned customers"
- "Which drivers are different for the under-35 segment?"
The user can refine the analysis iteratively without starting from scratch.