Upload Time Series Data
Upload a CSV with at least a date/time column and a numeric outcome column. Optional columns for exogenous predictors (ad spend, promotions, etc.) can be included.
Drag & drop time series file
CSV with columns: date, outcome, [optional exogenous variables]
Column Selection
Select which columns represent the time period, outcome (Y), and any exogenous predictors (X).
Date/Time Format Help
Supported date formats:
- YYYY-MM-DD (e.g., 2024-01-15)
- YYYY-MM (e.g., 2024-01)
- MM/DD/YYYY (e.g., 01/15/2024)
- DD/MM/YYYY (e.g., 15/01/2024)
Not supported:
- Spelled-out months (e.g., "January 2024")
- Relative dates (e.g., "Last week")
- Inconsistent formats within the same column
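As a rough illustration of the rules above, the sketch below checks values against the four supported formats using Python's standard `datetime.strptime`. The names `SUPPORTED_FORMATS`, `parse_date`, and `detect_column_format` are hypothetical, and the order in which formats are tried is an assumption of this sketch, not necessarily how the tool resolves ambiguous values.

```python
from datetime import datetime

# The four formats listed above, tried in order. MM/DD/YYYY comes before
# DD/MM/YYYY, so a lone ambiguous value like 03/04/2024 is read as March 4.
# That tie-break is an assumption of this sketch.
SUPPORTED_FORMATS = ["%Y-%m-%d", "%Y-%m", "%m/%d/%Y", "%d/%m/%Y"]

def parse_date(text):
    """Return a datetime for the first format that matches, or None."""
    for fmt in SUPPORTED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None

def detect_column_format(values):
    """Return the one format that parses every value in a column, or None."""
    for fmt in SUPPORTED_FORMATS:
        try:
            for value in values:
                datetime.strptime(value.strip(), fmt)
            return fmt
        except ValueError:
            continue
    return None
```

Checking the whole column at once is what catches mixed formats: a column containing both `2024-01-15` and `01/15/2024` has no single matching format, so `detect_column_format` returns `None`.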
Surefire Method (Recommended)
If you're having trouble with date parsing, use simple sequential labels instead:
- 1, 2, 3, 4, ... (numeric sequence)
- t1, t2, t3, t4, ... (labeled sequence)
- Period1, Period2, Period3, ...
- Week1, Week2, Week3, ... or Month1, Month2, ...
The model only needs to know the order of your observations — actual dates are just for labeling the output. Sequential numbers work perfectly!
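If you prepare the file yourself, one thing to watch is that labels such as Week10 sort before Week2 as plain text. The sketch below orders rows by the number at the end of each label. `sequence_index` is a hypothetical helper and the sample rows are made up for illustration.

```python
import re

def sequence_index(label):
    """Extract the trailing integer from labels like 7, 't7', or 'Week7'."""
    match = re.search(r"(\d+)\s*$", str(label))
    if match is None:
        raise ValueError(f"no trailing number in label: {label!r}")
    return int(match.group(1))

# Illustrative rows of (label, outcome); the values are invented.
rows = [("Week10", 130.0), ("Week2", 101.5), ("Week1", 98.0)]

# Sort by the numeric part so Week2 comes before Week10.
ordered = sorted(rows, key=lambda row: sequence_index(row[0]))
```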
Upload data to select exogenous predictors.
Model Specification
Set the ARIMA order (p, d, q). Use the diagnostics panel to help choose appropriate values.
Enable this if your data shows repeating patterns at regular intervals (e.g., weekly cycles, monthly patterns).
Analysis Settings
Used for coefficient p-values and significance stars.
VISUAL OUTPUT
Time Series with Forecasts
Upload data and fit the model to see the time series plot with forecasts and confidence intervals.
Interpretation Aid
The solid blue line shows historical values, while the red dashed line shows forecasts. The shaded area represents the confidence interval — wider bands indicate more uncertainty in future predictions. Forecasts depend on the assumed values for exogenous predictors above.
SARIMAX forecasts: If you enabled seasonality, the forecast should show the expected seasonal pattern (ups and downs) based on where you are in the cycle. If the forecast looks "flat" despite obvious historical patterns, check: (1) Is the pattern truly seasonal (fixed calendar cycles) or event-driven? (2) Is your seasonal period (s) correct? (3) Do you have enough data (at least 2 full cycles)?
Residuals Diagnostics
Residuals Over Time
Residuals should appear randomly scattered around zero with no obvious patterns.
ACF of Residuals
PACF of Residuals
Interpretation Aid: Understanding Residual Diagnostics
Residuals Over Time
Good model fit produces residuals that look like white noise: randomly distributed around zero with constant variance. Look for:
- No trends: Residuals shouldn't drift up or down over time
- Constant spread: The "band" of residuals should be roughly the same width throughout
- No patterns: Cycles or repeating structures suggest missed seasonality
ACF & PACF Charts
ACF (Autocorrelation): Measures correlation with past values at each lag.
PACF (Partial Autocorrelation): Direct correlation at each lag, removing intermediate effects.
- Bars outside red lines: Statistically significant autocorrelation (potential problem)
- All bars inside red lines: Residuals are consistent with white noise (good!)
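To make the chart reading concrete, here is a minimal sketch of how sample autocorrelations can be computed and compared against significance bounds. It assumes the red lines sit at the usual approximate 95% bounds of ±1.96/√n, which this page does not confirm. The function names are hypothetical.

```python
import math

def acf(values, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]

def significant_lags(values, max_lag):
    """Lags whose autocorrelation falls outside +/- 1.96 / sqrt(n) (assumed bounds)."""
    bound = 1.96 / math.sqrt(len(values))
    return [k for k, r in enumerate(acf(values, max_lag), start=1) if abs(r) > bound]

# Residuals that flip sign every period are clearly not white noise.
alternating = [1.0, -1.0] * 20
flagged = significant_lags(alternating, max_lag=2)
```

For the alternating series both lags are flagged, which corresponds to bars far outside the red lines on the chart.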
Problem Patterns & Solutions
| Pattern | Meaning | Try This |
|---|---|---|
| Significant ACF spikes at lags 1, 2, 3... | MA terms needed | Increase q (MA order) |
| Significant PACF spikes at lags 1, 2, 3... | AR terms needed | Increase p (AR order) |
| Slow decay in ACF | Series not stationary | Increase d (differencing) |
| Spikes at seasonal lags (12, 24, 52...) | Seasonal pattern not captured | Enable "Include Seasonality" and set s to the first seasonal lag with a spike (e.g., 12 if spikes appear at 12 and 24) |
SUMMARY STATISTICS
Descriptive Statistics
Outcome Variable
| Statistic | Value |
|---|---|
| Provide data to see summary statistics. | |
Exogenous Predictors
| Variable | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|
| Provide data to see predictor statistics. | | | | |
MODEL RESULTS
Interpretation Aid
AIC/BIC: Lower values indicate better model fit relative to complexity. Use these to compare different (p,d,q) or (P,D,Q,s) specifications. Try fitting ARIMA vs SARIMAX and compare!
RMSE: Root Mean Square Error, the square root of the average squared prediction error, in the same units as the outcome.
MAE: Mean Absolute Error, the average absolute prediction error; less sensitive to outliers than RMSE.
Tip: If adding seasonality increases AIC/BIC, the seasonal pattern may not be strong enough to justify the extra complexity.
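The difference between RMSE and MAE is easiest to see on a small example. The sketch below uses the standard definitions; the numbers are invented for illustration and are not output from the tool.

```python
import math

def rmse(actual, predicted):
    """Square root of the mean squared error."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mae(actual, predicted):
    """Mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [100, 102, 98, 101]
even_misses = [98, 104, 96, 103]    # every prediction is off by 2 units
one_big_miss = [100, 102, 98, 109]  # three exact predictions, one off by 8 units

# Both prediction sets have the same MAE, but RMSE penalizes the single
# large miss more heavily.
summary = {
    "even": (rmse(actual, even_misses), mae(actual, even_misses)),
    "outlier": (rmse(actual, one_big_miss), mae(actual, one_big_miss)),
}
```

If RMSE is much larger than MAE for your model, a few large errors are probably dominating the fit.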
APA-Style Report
Fit the model to see the APA-style statistical report.
Managerial Interpretation
Business-focused interpretation will appear here after fitting the model.
Coefficient Estimates
| Parameter | Estimate | Std. Error | p-value | 95% CI |
|---|---|---|---|---|
| Fit the model to view coefficient estimates. | | | | |
Interpretation Aid: Understanding Coefficients
Types of Coefficients
AR Coefficients (ar.L1, ar.L2, ...)
What they mean: How much yesterday's (or earlier) values influence today's value, after accounting for the trend.
- Positive (e.g., 0.7): Strong persistence — if sales were high last period, they'll likely be high this period
- Negative (e.g., -0.3): Mean reversion — high values tend to be followed by lower values
- Close to 0: Past values don't strongly predict current values
- Close to 1: Random walk behavior, where today ≈ yesterday + noise
Example: ar.L1 = 0.65 means "about 65% of last period's deviation from the mean carries over to this period."
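The carry-over idea can be sketched for a pure AR(1) model with no differencing and no other terms, which is a simplifying assumption. `ar1_forecast` is a hypothetical helper, and the mean of 100 and last value of 120 are invented.

```python
def ar1_forecast(last_value, mean, phi, steps):
    """Forecast y_t - mean = phi * (y_{t-1} - mean), with future noise set to zero."""
    forecasts = []
    deviation = last_value - mean
    for _ in range(steps):
        deviation *= phi  # each period keeps a fraction phi of the previous deviation
        forecasts.append(mean + deviation)
    return forecasts

# Last observed value is 20 units above a long-run mean of 100.
path = ar1_forecast(last_value=120, mean=100, phi=0.65, steps=3)
```

With these inputs the forecast deviation shrinks from 20 to 13, then to about 8.45 and 5.49, so the forecasts drift back toward the mean.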
MA Coefficients (ma.L1, ma.L2, ...)
What they mean: How much past forecast errors affect current values. These capture short-term adjustments.
- Positive (e.g., 0.5): If we under-predicted last period, we adjust upward this period
- Negative (e.g., -0.4): If we under-predicted last period, we adjust downward this period (common in differenced series)
Example: ma.L1 = 0.45 means "if our model was off by $100 last period, add about $45 to this period's prediction."
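The same arithmetic can be written out for a pure MA(1) model around a constant mean, a simplification that ignores differencing and AR terms. `ma1_forecast` is a hypothetical helper and the dollar figures are invented.

```python
def ma1_forecast(mean, theta, last_actual, last_forecast):
    """One-step MA(1) forecast: mean + theta * (last period's forecast error)."""
    last_error = last_actual - last_forecast
    return mean + theta * last_error

# The model predicted 1000 last period but sales came in at 1100 (under-predicted by 100).
adjusted_up = ma1_forecast(mean=1000, theta=0.45, last_actual=1100, last_forecast=1000)
adjusted_down = ma1_forecast(mean=1000, theta=-0.4, last_actual=1100, last_forecast=1000)
```

With theta = 0.45 the next forecast rises by 45 to 1045. With theta = -0.4 the same under-prediction lowers the next forecast by 40 to 960.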
Exogenous Coefficients (your predictor names)
What they mean: The direct effect of each external variable on your outcome, holding time-series dynamics constant.
- Interpreted like standard regression coefficients
- A one-unit increase in the predictor changes the outcome by the coefficient value
Example: ad_spend = 2.3 means "each additional $1 spent on advertising is associated with $2.30 more in sales, after accounting for trends and seasonality."
Sigma² (sigma2)
What it means: The estimated variance of the random error term. Larger values = more unexplained variability.
Seasonal Coefficients (ar.S.L*, ma.S.L*), SARIMAX Only
What they mean: These appear when you enable seasonality. They capture how values from the same season last cycle influence the current value.
- ar.S.L52: How this week's value relates to the same week last year (for s=52)
- ma.S.L12: Adjustment based on forecast errors from the same month last year (for s=12)
Example: ar.S.L52 = 0.8 means "80% of the deviation we saw in the same week last year carries over to this week."
If seasonal coefficients are non-significant, the seasonal pattern may be weak or you may have chosen the wrong period (s).
Statistical Significance
- p-value < 0.05: Coefficient is statistically significant (highlighted in green)
- p-value ≥ 0.05: Cannot rule out that the true effect is zero
- Confidence Interval: If the 95% interval doesn't include 0, the coefficient is significant at the 0.05 level
Non-significant AR or MA terms might indicate you've over-specified the model. Try reducing p or q (or P/Q for seasonal terms).
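The link between the estimate, standard error, p-value, and confidence interval can be sketched with a normal (z) approximation. Whether the tool uses exactly this calculation is an assumption. `wald_summary` is a hypothetical helper and the coefficient values are invented.

```python
import math

def wald_summary(estimate, std_error, z_crit=1.96):
    """Return the z statistic, two-sided normal p-value, and approximate 95% CI."""
    z = estimate / std_error
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    ci = (estimate - z_crit * std_error, estimate + z_crit * std_error)
    return z, p_value, ci

# Same standard error, different estimates.
strong = wald_summary(estimate=0.65, std_error=0.10)  # interval excludes 0
weak = wald_summary(estimate=0.12, std_error=0.10)    # interval includes 0
```

The two rules agree: the first coefficient has a p-value far below 0.05 and an interval that excludes 0, while the second has a p-value of about 0.23 and an interval that spans 0.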
Common Issues
- Very large standard errors: Possible multicollinearity or insufficient data
- AR coefficient at or above 1 in absolute value: The series is behaving as non-stationary and the model may be unstable; try increasing d (differencing)
- All exogenous coefficients non-significant: External variables may not help; try a simpler ARIMA model
- Seasonal coefficients non-significant: You may have the wrong seasonal period (s), or the pattern isn't truly seasonal
- Model takes very long to fit: Large seasonal periods (s=52) with high P/D/Q can be slow; try reducing the seasonal order (P,D,Q) to (1,0,1)
Forecasts
| Period | Forecast | Lower (95%) | Upper (95%) |
|---|---|---|---|
| Fit the model to view forecasts. Use the slider to select 0-10 forecast periods. | | | |
DIAGNOSTICS & ASSUMPTIONS
Stationarity & Model Diagnostics
Click "Check Stationarity" or fit the model to see diagnostic tests.
Augmented Dickey-Fuller Test (ADF)
Tests whether the series has a unit root (non-stationary).
What is the ADF Test?
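The ADF test checks whether your time series has a unit root, a common form of non-stationarity (statistical properties that change over time). The AR and MA terms assume a stationary series, which is why ARIMA differences the data d times before fitting them.
- p-value < 0.05: Evidence against a unit root; treat the series as stationary. Good to proceed.
- p-value ≥ 0.05: Cannot reject a unit root; treat the series as non-stationary. Differencing (d > 0) is needed.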
What makes a series non-stationary?
- Trends (consistently going up or down)
- Changing variance (volatility increases over time)
- Seasonal patterns with changing amplitude
The fix: Differencing (setting d=1 or d=2) removes trends. The model then works with changes rather than levels.
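Differencing itself is a simple operation, sketched below on two made-up series. `difference` is a hypothetical helper, not the tool's code.

```python
def difference(values, d=1):
    """Apply first differencing d times: each pass replaces y_t with y_t - y_{t-1}."""
    for _ in range(d):
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values

linear_trend = [10, 12, 14, 16, 18]
curved_trend = [1, 4, 9, 16, 25]

once = difference(linear_trend, d=1)   # constant changes: the trend is gone
still_trending = difference(curved_trend, d=1)
twice = difference(curved_trend, d=2)  # a second pass removes the remaining trend
```

Each pass shortens the series by one observation. That is one reason to avoid a larger d than the data require.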
Ljung-Box Test (Residual Autocorrelation)
Tests whether residuals exhibit significant autocorrelation.
What is the Ljung-Box Test?
This test checks if there's leftover pattern in your residuals (the differences between actual and fitted values).
- p-value > 0.05: No significant autocorrelation. Residuals look like random noise. Model is adequate.
- p-value ≤ 0.05: Significant autocorrelation detected. The model is missing some structure.
If you see significant autocorrelation:
- Try increasing p (AR order) if PACF shows spikes
- Try increasing q (MA order) if ACF shows spikes
- Check for seasonal patterns you haven't accounted for
- Consider adding more exogenous variables
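For readers curious about what the test computes, here is a minimal sketch of the Ljung-Box Q statistic, Q = n(n+2) Σ r_k² / (n − k), using only the standard library. Two simplifications are assumed: the number of lags must be even so the chi-square p-value has a closed form, and the degrees of freedom are not reduced for fitted AR/MA parameters. The tool's own settings may differ.

```python
import math

def ljung_box(residuals, lags=10):
    """Return (Q, p_value) for the Ljung-Box test; `lags` must be even in this sketch."""
    if lags % 2 != 0:
        raise ValueError("this sketch only handles an even number of lags")
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    q = 0.0
    for k in range(1, lags + 1):
        r_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        q += r_k * r_k / (n - k)
    q *= n * (n + 2)
    # Chi-square survival function with `lags` degrees of freedom
    # (closed form that is exact when the degrees of freedom are even).
    half = q / 2
    p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(lags // 2))
    return q, p_value

# Residuals that flip sign every period carry obvious leftover structure.
q_stat, p_val = ljung_box([1.0, -1.0] * 20, lags=10)
```

For this alternating series Q is very large and the p-value is effectively zero, which is the "missing some structure" case described above.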
Model Selection Guidance
Choosing p (AR order)
Look at the PACF plot of your series (differenced first if it is non-stationary):
- Count significant spikes (bars outside red lines)
- The number of spikes before cutoff = suggested p
- Typical values: 0, 1, or 2
PACF shows 2 significant spikes → try p=2
Choosing d (Differencing)
Based on the ADF test:
- ADF p-value < 0.05 → d=0 (already stationary)
- ADF p-value ≥ 0.05 → try d=1
- Still non-stationary with d=1 → try d=2
- Rarely need d > 2
Upward trend in data → likely need d=1
Choosing q (MA order)
Look at the ACF plot of your series (differenced first if it is non-stationary):
- Count significant spikes after lag 0
- Spikes that cut off sharply suggest MA terms
- Typical values: 0, 1, or 2
ACF shows 1 significant spike at lag 1 → try q=1
Common Starting Points
| Model | When to Use |
|---|---|
| ARIMA(1,1,1) | Good default for trending data with some persistence |
| ARIMA(0,1,1) | Random walk with smoothing (simple exponential smoothing) |
| ARIMA(1,0,0) | Stationary data with persistence (AR(1) process) |
| ARIMA(0,1,0) | Pure random walk (tomorrow = today + noise) |
| ARIMA(2,1,2) | More complex dynamics; try if simpler models fail diagnostics |
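The link between ARIMA(0,1,1) and simple exponential smoothing in the table can be checked numerically. The sketch below assumes the sign convention y_t = y_{t-1} + e_t + θ·e_{t-1}, under which the smoothing weight is α = 1 + θ. Both functions are hypothetical helpers and the series is invented.

```python
def arima011_forecasts(values, theta):
    """One-step forecasts from y_t = y_{t-1} + e_t + theta * e_{t-1}."""
    forecasts = [values[0]]  # start the recursion at the first observation
    for y in values:
        error = y - forecasts[-1]
        forecasts.append(y + theta * error)
    return forecasts  # the last entry is the forecast for the next, unseen period

def ses_forecasts(values, alpha):
    """Simple exponential smoothing with smoothing weight alpha."""
    forecasts = [values[0]]
    for y in values:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts

series = [50.0, 53.0, 49.0, 55.0, 58.0, 54.0]
from_arima = arima011_forecasts(series, theta=-0.6)
from_ses = ses_forecasts(series, alpha=0.4)

# With theta = 0 the same recursion reduces to ARIMA(0,1,0): forecast = last value.
random_walk = arima011_forecasts(series, theta=0.0)
```

The two forecast paths match to floating-point precision. A negative MA coefficient after one round of differencing therefore often indicates smoothing rather than a problem with the model.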
Model Comparison Tip
Try a few different (p, d, q) combinations and compare:
- AIC/BIC: Lower is better (penalizes complexity)
- Ljung-Box p-value: Should be > 0.05
- RMSE: Lower means smaller prediction errors
The best model balances fit (low RMSE) with simplicity (few parameters, low AIC).
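The trade-off can be sketched with the standard formulas AIC = 2k − 2·ln(L) and BIC = k·ln(n) − 2·ln(L), where k is the number of estimated parameters and n the number of observations. The log-likelihoods and parameter counts below are made up for illustration and do not come from the tool.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits on 100 observations: (log-likelihood, number of parameters).
candidates = {
    "ARIMA(1,1,1)": (-250.0, 3),
    "ARIMA(2,1,2)": (-249.2, 5),
}

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs=100))
    for name, (ll, k) in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
```

In this made-up case the larger model fits slightly better (higher log-likelihood) but still loses on both AIC and BIC, because the improvement is too small to offset two extra parameters.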