Z-Score Reversion Strategy
Build a statistical mean-reversion strategy using z-score of price relative to a rolling mean — with entry/exit threshold tuning.
Strategy — Z-Score Mean Reversion
1–2. Installation and Imports
import warnings; warnings.filterwarnings("ignore")
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
!pip install pandas numpy plotly
3. Strategy Overview
The Z-score measures how many standard deviations the current price is from its rolling mean:
Z = (Close − Rolling Mean) / Rolling Standard Deviation
Signal logic:
- Z < −threshold (e.g., −2.0) → Buy (+1): price is more than 2 standard deviations below its rolling mean, a statistically extreme low, so reversion upward is expected.
- Z > +threshold (e.g., +2.0) → Sell (−1): price is more than 2 standard deviations above its rolling mean, a statistically extreme high, so reversion downward is expected.
- |Z| ≤ threshold → No signal (0): price is within its normal deviation range. Warm-up bars, where Z is still undefined, also get 0.
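As a small worked illustration of the formula and signal rules above, the sketch below computes the Z-score on a hypothetical six-bar series. The prices, the 5-bar window, and the 1.5 threshold are assumptions chosen so the arithmetic is easy to follow; they are not the tutorial's settings.

```python
import numpy as np
import pandas as pd

# Hypothetical toy closes: a quiet series followed by one sharp drop.
close = pd.Series([100.0, 101.0, 99.0, 100.0, 100.0, 94.0])

window = 5        # illustrative; the tutorial's default is 20
threshold = 1.5   # illustrative; with the sample std, a 5-bar window
                  # cannot produce |Z| above 4 / sqrt(5) ≈ 1.79

roll_mean = close.rolling(window).mean()
roll_std = close.rolling(window).std()   # pandas default: sample std (ddof=1)
z = (close - roll_mean) / roll_std

# Comparisons against NaN are False, so warm-up bars fall through to 0.
signal = np.where(z < -threshold, 1, np.where(z > threshold, -1, 0))

print(pd.DataFrame({"close": close, "z": z.round(4), "signal": signal}))
```

For the last bar the window is [101, 99, 100, 100, 94], with mean 98.8 and sample standard deviation of about 2.775, so Z ≈ (94 − 98.8) / 2.775 ≈ −1.73 and the bar is flagged as a buy.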
Relationship to Bollinger Bands: The Z-score strategy generates the same signals as Bollinger Band mean reversion when the bands use the same window, the same standard-deviation convention, and a band width equal to the threshold: Z < −k exactly when the close is below the lower band (mean − k·std). The difference is that the Z-score is a unit-free standardized number, which makes it directly comparable across assets with different price levels and volatilities, and more convenient as a machine learning feature.
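The equivalence with Bollinger Bands can be checked numerically. The sketch below is a minimal check on a hypothetical seeded random walk, assuming both methods share the same window and the pandas default sample standard deviation (some charting packages compute bands with the population standard deviation, in which case the two would differ slightly).

```python
import numpy as np
import pandas as pd

# Hypothetical price series: a seeded random walk around 100.
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 300).cumsum())

window, k = 20, 2.0
mean = close.rolling(window).mean()
std = close.rolling(window).std()

# Z-score formulation
z = (close - mean) / std
z_buy, z_sell = z < -k, z > k

# Bollinger Band formulation with band width k
upper, lower = mean + k * std, mean - k * std
band_buy, band_sell = close < lower, close > upper

print("buy signals agree: ", bool((z_buy == band_buy).all()))
print("sell signals agree:", bool((z_sell == band_sell).all()))
```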
Why it works: If deviations of price from its rolling mean were approximately normally distributed, |Z| > 2 would occur on only about 4.6% of bars. The mean-reversion trade bets on the tendency of prices to return toward their short-term average after such extreme deviations. This assumption suits range-bound series; in a trending market an extreme Z-score can persist or grow.
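The tail probability quoted above can be checked directly from the standard normal distribution. The sketch below uses only the standard library; the helper name `two_sided_tail` is hypothetical.

```python
import math

def two_sided_tail(threshold: float) -> float:
    """Return P(|Z| > threshold) for a standard normal Z."""
    # For a standard normal, P(|Z| > t) = erfc(t / sqrt(2)).
    return math.erfc(threshold / math.sqrt(2))

for t in (1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"P(|Z| > {t}) = {two_sided_tail(t):.4f}")
```

At a threshold of 2.0 this gives roughly 0.0455. A rolling Z-score of real or simulated prices need not follow this distribution: in the run shown in section 5, 53 of the 481 bars with a defined Z-score (about 11%) fall outside ±2, so the normal figure is better read as a rough guide than as an expected signal frequency.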
4. Data Generation
def generate_data(periods: int) -> pd.DataFrame:
    """Generate synthetic 1-minute OHLCV bars from an unseeded random walk."""
    start_date = pd.to_datetime("2024-01-01 00:00:00+00:00")
    datetime_index = pd.date_range(start_date, periods=periods, freq="1min", tz="UTC")
    price_data = []
    last_close = 42000
    for _ in range(periods):
        # Open gaps slightly from the previous close; the close moves further from the open.
        open_price = last_close + np.random.normal(0, last_close * 0.0005)
        close_price = open_price + np.random.normal(0, last_close * 0.005)
        body_high = max(open_price, close_price)
        body_low = min(open_price, close_price)
        # Wicks extend beyond the candle body by a non-negative random amount.
        high_price = max(body_high + abs(np.random.normal(0, last_close * 0.002)), open_price, close_price)
        low_price = min(body_low - abs(np.random.normal(0, last_close * 0.002)), open_price, close_price)
        if high_price < low_price:
            high_price, low_price = low_price, high_price
        # Prices are truncated to integers and floored at 1.
        price_data.append({"open": max(1, int(open_price)), "high": max(1, int(high_price)),
                           "low": max(1, int(low_price)), "close": max(1, int(close_price))})
        last_close = close_price
    df = pd.DataFrame(price_data, index=datetime_index)
    df.index.name = "datetime"
    df["volume"] = np.random.uniform(100.0, 500.0, periods)
    # Keep the timestamp as a regular column and return a plain integer index.
    df["datetime"] = df.index.to_series()
    return df.reset_index(drop=True)

df = generate_data(500)
display(df.head())

| | open | high | low | close | volume | datetime |
|---|---|---|---|---|---|---|
| 0 | 41981 | 42248 | 41826 | 42185 | 314.079864 | 2024-01-01 00:00:00+00:00 |
| 1 | 42149 | 42288 | 42145 | 42165 | 396.336709 | 2024-01-01 00:01:00+00:00 |
| 2 | 42191 | 42194 | 42131 | 42146 | 474.275682 | 2024-01-01 00:02:00+00:00 |
| 3 | 42165 | 42256 | 42123 | 42125 | 119.522158 | 2024-01-01 00:03:00+00:00 |
| 4 | 42123 | 42251 | 41747 | 41781 | 487.253949 | 2024-01-01 00:04:00+00:00 |
5. Strategy Function
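def zscore_reversion_strategy(
    df: pd.DataFrame,
    window: int = 20,
    entry_z: float = 2.0,
) -> pd.DataFrame:
    """Add 'zscore' and 'signal' (+1 buy, -1 sell, 0 none) columns to a copy of df."""
    df = df.copy().sort_values("datetime", ignore_index=True)
    roll_mean = df["close"].rolling(window).mean()
    roll_std = df["close"].rolling(window).std()
    # A zero std (all closes in the window identical) becomes NaN to avoid dividing by zero.
    df["zscore"] = (df["close"] - roll_mean) / roll_std.replace(0, np.nan)
    # NaN z-scores compare as False in both tests, so they map to 0.
    df["signal"] = np.where(df["zscore"] < -entry_z, 1,
                            np.where(df["zscore"] > entry_z, -1, 0))
    return df
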
df_signals = zscore_reversion_strategy(df, window=20, entry_z=2.0)
print("--- Signal Distribution ---")
print(df_signals["signal"].value_counts())
print("\n--- Z-Score Statistics ---")
print(df_signals["zscore"].describe().round(4))

--- Signal Distribution ---
signal
 0    447
 1     36
-1     17
Name: count, dtype: int64

--- Z-Score Statistics ---
count    481.0000
mean      -0.4286
std        1.2630
min       -2.8989
25%       -1.3968
50%       -0.7337
75%        0.5272
max        2.7887
Name: zscore, dtype: float64
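The signal counts above depend on the choice of entry_z. One simple way to explore that dependence is to sweep the threshold and count signals at each value. The sketch below does this on a hypothetical seeded random walk rather than the tutorial's unseeded data, and the helper `count_signals` is an illustrative name, not part of the strategy function.

```python
import numpy as np
import pandas as pd

# Hypothetical close series: a seeded random walk starting near 42,000.
rng = np.random.default_rng(42)
close = pd.Series(42000 + rng.normal(0, 200, 500).cumsum())

def count_signals(close: pd.Series, window: int, entry_z: float) -> tuple[int, int]:
    """Return (buy_count, sell_count) for one window/threshold pair."""
    mean = close.rolling(window).mean()
    std = close.rolling(window).std().replace(0, np.nan)
    z = (close - mean) / std
    return int((z < -entry_z).sum()), int((z > entry_z).sum())

rows = []
for entry_z in (1.0, 1.5, 2.0, 2.5, 3.0):
    buys, sells = count_signals(close, window=20, entry_z=entry_z)
    rows.append({"entry_z": entry_z, "buys": buys, "sells": sells})

sweep = pd.DataFrame(rows)
print(sweep)
```

Raising the threshold can only remove signals, never add them, so both counts fall (or stay level) as entry_z grows. Counting signals says nothing about profitability; that would need a backtest with exits and costs, which this sketch does not attempt.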
Explanation:
- roll_std.replace(0, np.nan): Prevents division by zero when every close in a window is identical and the rolling standard deviation is exactly 0. The warm-up bars before the window fills need no guard, because rolling() already returns NaN there; this is why only 481 of the 500 bars have a Z-score with window=20.
- The Z-score is dimensionless, so the same threshold (e.g., ±2.0) can be applied whether the asset price is $42,000 or $2, without rescaling.
- The window parameter controls the reference period for the mean and standard deviation: shorter windows make the signal more sensitive to recent price changes; longer windows make it more stable but slower to react.
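The strategy function above emits per-bar entry signals only; it has no exit rule. One possible way to add an exit threshold is a small state machine that enters when |Z| exceeds entry_z and goes flat once Z has reverted to within exit_z of zero. The sketch below is an assumption about how such a rule could look: the function name `zscore_positions`, the exit_z parameter, its 0.5 default, and the choice to hold the current position through NaN values are all hypothetical.

```python
import numpy as np
import pandas as pd

def zscore_positions(z: pd.Series, entry_z: float = 2.0, exit_z: float = 0.5) -> pd.Series:
    """Turn a z-score series into held positions: +1 long, -1 short, 0 flat."""
    position = 0
    held = []
    for value in z:
        if not np.isnan(value):          # NaN: keep the current position (assumption)
            if position == 0:
                if value < -entry_z:
                    position = 1         # extreme low: enter long
                elif value > entry_z:
                    position = -1        # extreme high: enter short
            elif position == 1 and value >= -exit_z:
                position = 0             # long exits once z has risen back near zero
            elif position == -1 and value <= exit_z:
                position = 0             # short exits once z has fallen back near zero
        held.append(position)
    return pd.Series(held, index=z.index, name="position")

# Hypothetical z-score path: one long trade followed by one short trade.
z = pd.Series([np.nan, -2.5, -1.0, -0.4, 0.0, 2.3, 1.0, 0.6, 0.2])
positions = zscore_positions(z)
print(positions.tolist())
```

With these inputs the long entered at Z = −2.5 is held through −1.0 and closed at −0.4, and the short entered at 2.3 is held until Z drops to 0.2. A wider gap between entry_z and exit_z holds trades longer; setting exit_z to 0 waits for a full return to the rolling mean.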
6. Visualization
buy_signals = df_signals[df_signals["signal"] == 1]
sell_signals = df_signals[df_signals["signal"] == -1]
fig = make_subplots(rows=2, cols=1, shared_xaxes=True,
                    subplot_titles=["Price + Signals", "Z-Score"],
                    row_heights=[0.65, 0.35])
fig.add_trace(go.Candlestick(
    x=df_signals["datetime"],
    open=df_signals["open"], high=df_signals["high"],
    low=df_signals["low"], close=df_signals["close"],
    name="Price"), row=1, col=1)
fig.add_trace(go.Scatter(
    x=buy_signals["datetime"], y=buy_signals["low"] * 0.999,
    mode="markers", marker=dict(symbol="triangle-up", size=10, color="green"), name="Buy (+1)"),
    row=1, col=1)
fig.add_trace(go.Scatter(
    x=sell_signals["datetime"], y=sell_signals["high"] * 1.001,
    mode="markers", marker=dict(symbol="triangle-down", size=10, color="red"), name="Sell (−1)"),
    row=1, col=1)
fig.add_trace(go.Scatter(
    x=df_signals["datetime"], y=df_signals["zscore"],
    mode="lines", name="Z-Score", line=dict(color="purple", width=1)), row=2, col=1)
fig.add_hline(y=2.0, line_dash="dash", line_color="red", row=2, col=1, annotation_text="+2σ")
fig.add_hline(y=-2.0, line_dash="dash", line_color="green", row=2, col=1, annotation_text="−2σ")
fig.add_hline(y=0, line_dash="dot", line_color="gray", row=2, col=1)
fig.update_layout(
    title_text="Z-Score Mean Reversion Strategy",
    xaxis_rangeslider_visible=False,
    height=700, yaxis=dict(autorange=True),
    xaxis2_title="Datetime", yaxis2_title="Z-Score",
)
fig.show()