Logistic Regression Tool


Fit and interpret logistic regression models for marketing data with a binary outcome and a mix of continuous and categorical predictors. Upload raw rows, pick your success outcome, and compare the marginal effects of each predictor on conversion probabilities with confidence intervals and diagnostics.

👨‍🏫 Professor Mode: Guided Learning Experience

New to logistic regression? Enable Professor Mode for step-by-step guidance through building and interpreting your first model!

TEST OVERVIEW & EQUATIONS

Logistic regression estimates how the probability of a binary outcome (such as convert vs. not convert) changes with several predictors \(X_1, X_2, \dots, X_p\). Each coefficient shows the unique association of a predictor with the log-odds of success, holding the others constant.

Model: $$ \log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_p X_{pi} $$ where \(p_i = \Pr(Y_i = 1 \mid X_{1i}, \dots, X_{pi})\).

Coefficient tests: $$ z_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)} $$ with p-values based on the standard normal distribution.

Model comparison: $$ \Delta D = D_{\text{null}} - D_{\text{model}} $$ is compared to a chi-square distribution with degrees of freedom equal to the number of predictors, testing whether the predictors, as a set, improve fit over an intercept-only model.
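For readers who want to reproduce these quantities outside the tool, here is a minimal Python sketch using statsmodels; the file name campaign_rows.csv and the columns converted, ad_spend, and channel are hypothetical placeholders, not the tool's actual data or implementation.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical raw rows: a 0/1 outcome plus one numeric and one categorical predictor.
df = pd.read_csv("campaign_rows.csv")

# Fit the logit model; C(channel) dummy-codes the categorical predictor.
fit = smf.logit("converted ~ ad_spend + C(channel)", data=df).fit()

print(fit.summary())                        # coefficients, SEs, z statistics, p-values
print("Model chi-square:", fit.llr)         # 2 * (llf - llnull) = D_null - D_model
print("Model p-value:   ", fit.llr_pvalue)  # chi-square test vs. intercept-only model
```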

Binary Outcome & Predictor Types

The outcome must be coded as a binary variable (for example, 0/1 or success/failure). Continuous predictors use their numeric scale. Categorical predictors are dummy-coded with a reference level, so each coefficient compares a category to its reference while holding other predictors constant.
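As a concrete illustration of reference-level dummy coding, here is a small pandas sketch; the channel column and its levels are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"channel": ["email", "search", "social", "email", "social"]})

# drop_first=True omits one level ("email" here) as the reference category,
# so each remaining dummy column compares that level to the reference.
dummies = pd.get_dummies(df["channel"], prefix="channel", drop_first=True)
print(dummies)
```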

MARKETING SCENARIOS

Use presets to explore realistic use cases, such as ad spend vs. conversion or control vs. treatment on purchase (buy vs. not buy). Each scenario can expose either a summary CSV of aggregated statistics or a raw data file that you can download, edit in Excel, and re-upload.

INPUTS & SETTINGS

Upload Raw Data File

Upload a CSV file with raw case-level data. Include one binary outcome column (0/1 or two categories) and multiple predictors (numeric or categorical). Headers are required.

Drag & Drop raw data file (.csv, .tsv, .txt)

Include headers; at least one binary outcome column and predictors (numeric or text for categorical).


Confidence Level & Reporting

Set the significance level (alpha) for hypothesis tests; confidence intervals use the corresponding 1 - alpha confidence level.

Standardization affects model fitting and effect plots only. Summary statistics always report predictors on their original scale.

VISUAL OUTPUT

Confusion Matrix

Advanced Options
Classification threshold (predict 1 when Pr ≥ threshold)

Interpretation Aid

The confusion matrix shows how well the model classifies observations into the two outcome categories. The rows represent the actual outcomes (0 or 1) and columns represent the predicted outcomes. Diagonal cells (true positives and true negatives) indicate correct predictions; off-diagonal cells show errors. Adjust the classification threshold to balance sensitivity (correctly identifying 1s) versus specificity (correctly identifying 0s) based on your business priorities.

Classification Metrics: Accuracy = overall % correct. Sensitivity (Recall) = % of actual 1s correctly identified. Specificity = % of actual 0s correctly identified. Precision (PPV) = % of predicted 1s that are actually 1. F1 Score = harmonic mean of precision and recall (balances both). NPV = % of predicted 0s that are actually 0.
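If you want to verify these metrics by hand, here is a small sketch using scikit-learn and NumPy; the toy y_true and p_hat arrays are made-up stand-ins for your observed outcomes and the model's predicted probabilities.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])                   # observed outcomes (toy values)
p_hat  = np.array([0.2, 0.6, 0.8, 0.4, 0.9, 0.1, 0.7, 0.3])   # predicted probabilities

y_pred = (p_hat >= 0.5).astype(int)                            # classification threshold = 0.5
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy    = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # recall: share of actual 1s correctly identified
specificity = tn / (tn + fp)   # share of actual 0s correctly identified
precision   = tp / (tp + fp)   # PPV: share of predicted 1s that are actually 1
f1          = 2 * precision * sensitivity / (precision + sensitivity)
npv         = tn / (tn + fn)   # share of predicted 0s that are actually 0
print(accuracy, sensitivity, specificity, precision, f1, npv)
```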

Classification Performance (at threshold = 0.5)

Accuracy:
Sensitivity (Recall):
Specificity:
Precision (PPV):
F1 Score:
Negative Predictive Value:

ROC Curve

AUC (Area Under Curve):
Interpretation Aid

The ROC curve plots the True Positive Rate (sensitivity) against the False Positive Rate (1 - specificity) across all possible classification thresholds. A curve closer to the top-left corner indicates better discrimination. The Area Under Curve (AUC) summarizes overall performance: 1.0 = perfect discrimination, 0.5 = no better than random guessing, < 0.5 = worse than random. AUC > 0.7 is often considered acceptable, > 0.8 good, > 0.9 excellent.

Reading the curve: Each point on the curve represents a different classification threshold. Hover over any point to see which threshold it represents (along with its TPR and FPR). Key thresholds (0.3, 0.5, 0.7) are marked with red dots and labeled. Moving right along the curve means lowering the threshold (classifying more cases as positive), which increases both true positives (good) and false positives (bad). The caption shows specific threshold examples for easy reference.
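The same quantities can be computed directly; a minimal scikit-learn sketch, again using made-up y_true and p_hat values:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
p_hat  = np.array([0.2, 0.6, 0.8, 0.4, 0.9, 0.1, 0.7, 0.3])

fpr, tpr, thresholds = roc_curve(y_true, p_hat)   # one (FPR, TPR) point per threshold
print("AUC:", roc_auc_score(y_true, p_hat))

# Lower thresholds sit further to the right on the curve: more cases are
# classified as positive, so both TPR and FPR increase.
for t, x, y in zip(thresholds, fpr, tpr):
    print(f"threshold={t:.2f}  FPR={x:.2f}  TPR={y:.2f}")
```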

Predicted Probability Distribution

Interpretation Aid

These overlapping histograms show the distribution of predicted probabilities for cases where the outcome was actually 0 (blue) versus actually 1 (red). Good model discrimination means the distributions are well-separated—cases with outcome = 1 should cluster at higher predicted probabilities. Heavy overlap suggests the model has difficulty distinguishing the two groups. This complements the confusion matrix by showing the full probability spectrum before thresholding.
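A plot of this kind can be reproduced with a few lines of matplotlib; the y_true and p_hat arrays below are toy stand-ins for your data.

```python
import numpy as np
import matplotlib.pyplot as plt

y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p_hat  = np.array([0.10, 0.25, 0.40, 0.55, 0.45, 0.60, 0.80, 0.95])

# Overlay the predicted-probability distributions for the two actual outcomes.
plt.hist(p_hat[y_true == 0], bins=10, range=(0, 1), alpha=0.5, label="actual 0")
plt.hist(p_hat[y_true == 1], bins=10, range=(0, 1), alpha=0.5, label="actual 1")
plt.xlabel("Predicted probability of focal outcome")
plt.ylabel("Count")
plt.legend()
plt.show()
```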

Predicted probabilities vs. focal predictor

Focal range (continuous):

Hold other predictors constant

Choose levels/values for the non-focal predictors used when plotting the focal curve.

Interpretation Aid

The line (or bars for categorical focals) shows the predicted probability that the focal outcome (coded as 1) occurs while holding other predictors constant at chosen values. Steeper slopes or larger gaps between bars imply stronger effects. Confidence bands/bars reflect the statistical uncertainty for those probabilities; wider bands mean less certainty. If bands for different settings overlap heavily, the model may not distinguish them well at those values.
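To sketch how such a curve can be generated: build a grid over the focal predictor, hold the other predictors at chosen values, and predict. The file name, column names, and levels below are hypothetical, and the sketch omits the confidence bands the tool draws.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("campaign_rows.csv")                           # hypothetical raw data
fit = smf.logit("converted ~ ad_spend + C(channel)", data=df).fit()

# Grid over the focal predictor, holding the non-focal predictor at a fixed level.
grid = pd.DataFrame({
    "ad_spend": np.linspace(df["ad_spend"].min(), df["ad_spend"].max(), 50),
    "channel": "email",                                          # chosen constant level
})
grid["p_hat"] = fit.predict(grid)   # predicted probability along the focal curve
print(grid.head())
```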

Variable Importance (Odds Ratios)

Interpretation Aid

This forest plot displays odds ratios (OR) with 95% confidence intervals for each predictor. Odds ratios show multiplicative effects: OR > 1 means the predictor increases odds of the focal outcome, OR < 1 decreases odds, OR = 1 (dashed reference line) means no effect. For example, OR = 2.0 means doubling the odds (100% increase), OR = 0.5 means halving the odds (50% decrease), OR = 1.5 means 50% increase.

The horizontal bars are 95% confidence intervals showing statistical uncertainty. If a bar crosses the 1.0 line, the effect is not statistically significant (could be no effect). Longer bars = more uncertainty. Variables are sorted by distance from 1.0 to show strongest effects first. The x-axis is log-scaled so equal distances represent equal multiplicative effects.
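Exponentiating the log-odds coefficients and their confidence limits yields the plotted odds ratios; a minimal statsmodels sketch, with the same hypothetical columns as above:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("campaign_rows.csv")                           # hypothetical raw data
fit = smf.logit("converted ~ ad_spend + C(channel)", data=df).fit()

ci = fit.conf_int()                                             # 95% CIs on the log-odds scale
or_table = pd.DataFrame({
    "odds_ratio": np.exp(fit.params),
    "ci_lower":   np.exp(ci[0]),
    "ci_upper":   np.exp(ci[1]),
})
# Sort by distance from OR = 1 on the log scale so the strongest effects come first.
or_table["abs_log_or"] = np.abs(np.log(or_table["odds_ratio"]))
print(or_table.drop(index="Intercept", errors="ignore").sort_values("abs_log_or", ascending=False))
```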

SUMMARY STATISTICS

Summary Statistics

Outcome Variable

Variable | % Focal Outcome | Count Focal | Count Non-Focal | Total n
Provide data to see outcome summary.

Continuous Predictors

Variable | Mean | Median | Std. Dev. | Min | Max
Provide data to see summary statistics.

Categorical Predictors (% by level)

Predictor | Level | Percent
Provide data to see level percentages.

TEST RESULTS

Regression Equation

Provide data to see the fitted regression equation.

The downloaded file includes your original raw data plus two columns: p_hat (the model's predicted probability of the focal outcome for each observation) and neg_loglik_contribution (that observation's contribution to the negative log-likelihood that the fitting procedure minimizes).
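Both columns follow directly from the outcomes and predicted probabilities; a minimal NumPy sketch with toy values:

```python
import numpy as np

y     = np.array([0, 1, 1, 0, 1])                 # observed 0/1 outcomes (toy values)
p_hat = np.array([0.15, 0.80, 0.55, 0.30, 0.90])  # model's predicted probabilities

# Each row's contribution to the negative log-likelihood minimized during fitting
# (its log-loss): large values flag observations the model predicts poorly.
neg_loglik_contribution = -(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))
print(neg_loglik_contribution)
print("Total negative log-likelihood:", neg_loglik_contribution.sum())
```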

Log-likelihood:
Null deviance:
Residual deviance:
Model chi-square:
Model p-value:
Pseudo R-squared:
Sample size (n):
Alpha:
Interpretation Aid

Log-likelihood / Deviance: Log-likelihood measures how well the model explains the observed pattern of 0/1 outcomes; deviance is -2 times the log-likelihood, measured against a saturated model, so lower deviance means better fit. A numeric sketch after these notes shows how these quantities relate.

Model chi-square & p-value: Compares the fitted model to an intercept-only (null) model using the difference in deviance. A small p-value (< alpha) means the predictors, as a set, improve the ability to predict success vs. failure.

Pseudo R-squared: A rough analogue of R-squared that summarizes how much the model improves fit relative to the null model. It is useful as a descriptive measure but should not be overinterpreted as "percent of variance explained."

n: The sample size, which determines how much information is available for estimating effects. Very small n can make estimates unstable or produce separation, where a predictor perfectly predicts the outcome.

Alpha: Your chosen significance level. P-values below alpha are treated as statistically reliable; p-values at or above alpha are not.
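Tying the quantities above together, here is a minimal numeric sketch; the log-likelihood values and predictor count are hypothetical, and the pseudo R-squared shown is McFadden's, one of several variants.

```python
from scipy import stats

llf, llnull  = -210.4, -245.9   # model and intercept-only log-likelihoods (toy values)
n_predictors = 4                # coefficients estimated beyond the intercept

residual_deviance = -2 * llf
null_deviance     = -2 * llnull
model_chi_square  = null_deviance - residual_deviance        # = 2 * (llf - llnull)
model_p_value     = stats.chi2.sf(model_chi_square, df=n_predictors)
pseudo_r_squared  = 1 - llf / llnull                         # McFadden's pseudo R-squared

print(residual_deviance, null_deviance, model_chi_square, model_p_value, pseudo_r_squared)
```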

Managerial Interpretation

APA-Style Report

Coefficient Estimates (Log-odds and Odds Ratios)

Predictor | Level / Term | Estimate (log-odds) | Standard Error | z | p-value | Odds Ratio | Lower Bound | Upper Bound
Provide data to see coefficient estimates.
Interpretation Aid

DIAGNOSTICS & ASSUMPTIONS

Diagnostics & Assumption Checks

Run the analysis to see checks on multicollinearity, linearity of continuous predictors on the log-odds scale, and influential or poorly fit cases. Use these as prompts for plots and follow-up modeling, not as strict pass/fail gates.

Hosmer-Lemeshow Goodness-of-Fit Test

Run the analysis to see calibration test.

What does this test?

The Hosmer-Lemeshow test checks if predicted probabilities are well-calibrated—i.e., do cases with predicted probability ≈ 70% actually have the outcome 70% of the time? The test groups observations by predicted probability and compares observed vs. expected frequencies. A large p-value (> 0.05) suggests good calibration (no evidence of poor fit). A small p-value (< 0.05) suggests the model's probability estimates may be systematically biased, even if classification accuracy is high.
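For reference, here is a minimal implementation of the grouping-and-comparison logic; the tool's exact grouping rule is assumed, and this sketch simply uses equal-sized groups by sorted predicted probability.

```python
import numpy as np
from scipy import stats

def hosmer_lemeshow(y_true, p_hat, groups=10):
    """Hosmer-Lemeshow chi-square statistic and p-value for calibration."""
    order = np.argsort(p_hat)
    y_sorted, p_sorted = np.asarray(y_true)[order], np.asarray(p_hat)[order]
    chi2 = 0.0
    # Split the sorted observations into roughly equal-sized probability groups.
    for idx in np.array_split(np.arange(len(p_sorted)), groups):
        n_g = len(idx)
        observed = y_sorted[idx].sum()     # observed successes in the group
        expected = p_sorted[idx].sum()     # expected successes = sum of probabilities
        chi2 += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
    return chi2, stats.chi2.sf(chi2, df=groups - 2)

# Toy example with 5 groups of 4 observations each.
y = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1])
p = np.linspace(0.05, 0.95, 20)
print(hosmer_lemeshow(y, p, groups=5))
```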

Actual vs. Fitted

Interpretation Aid

Each point plots a fitted probability on the horizontal axis and the observed 0/1 outcome (with a small amount of vertical jitter for visibility) on the vertical axis. Points clustered near 0 or 1 on the x-axis indicate confident predictions; a mix of 0s and 1s at similar fitted probabilities indicates uncertainty. Strong patterns or obvious outliers can signal model misspecification or influential cases to review.

Residuals vs. Fitted