Z-Score Reversion Strategy

Build a statistical mean-reversion strategy using the z-score of price relative to a rolling mean, with entry/exit threshold tuning.

Tags: z-score, statistics, mean reversion

Strategy — Z-Score Mean Reversion


1–2. Installation and Imports

[1]
!pip install pandas numpy plotly

import warnings
warnings.filterwarnings("ignore")

import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots

3. Strategy Overview

The Z-score measures how many standard deviations the current price is from its rolling mean:

Z = (Close − Rolling Mean) / Rolling Standard Deviation

Signal logic:

  • Z < −threshold (e.g., −2.0) → Buy (+1): price is more than 2 standard deviations below its recent average, a statistically extreme low; reversion upward toward the mean is expected.
  • Z > +threshold (e.g., +2.0) → Sell (−1): price is more than 2 standard deviations above its recent average, a statistically extreme high; reversion downward toward the mean is expected.
  • |Z| ≤ threshold → No signal (0): price is within its normal deviation range. Rows in the warm-up period, where Z is undefined, also get 0.
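
As a small worked example of the formula and the signal rules above, the sketch below computes the Z-score of the last close in a hand-made 8-bar window. The price values and the `classify` helper are illustrative assumptions, not part of the notebook; the standard deviation uses the pandas default (sample standard deviation, ddof=1), which is what `rolling(...).std()` also uses.

```python
import pandas as pd

# Hypothetical closes: seven bars near 100, then a sharp drop to 90.
closes = pd.Series([100.0, 101.0, 99.0, 100.0, 102.0, 98.0, 100.0, 90.0])

def classify(z: float, threshold: float = 2.0) -> int:
    """Map a Z-score to +1 (buy), -1 (sell), or 0 (no signal)."""
    if z < -threshold:
        return 1
    if z > threshold:
        return -1
    return 0  # also reached when z is NaN, since NaN comparisons are False

window_mean = closes.mean()   # 98.75
window_std  = closes.std()    # sample std (ddof=1), about 3.73
z_last = (closes.iloc[-1] - window_mean) / window_std

print(f"mean={window_mean:.2f}  std={window_std:.4f}  z={z_last:.4f}  signal={classify(z_last)}")
```

The last close sits about 2.34 standard deviations below the window mean, so it crosses the −2.0 threshold and is classified as a buy.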

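Relationship to Bollinger Bands: The Z-score strategy is mathematically equivalent to Bollinger Band mean reversion when the bands use the same window and standard deviation estimate and the band multiplier equals the threshold: Z < −k holds exactly when Close < Rolling Mean − k × Rolling Standard Deviation, and likewise for the upper band. The difference is one of presentation. The Z-score is a unit-free standardized number, which makes it directly comparable across assets with different price levels and volatilities and more convenient as a machine learning feature.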
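
The equivalence can be checked numerically. The sketch below is an illustration under stated assumptions: the price series is a seeded random walk invented for the check, the bands are built from the same rolling mean and sample standard deviation as the Z-score, and the variable names are hypothetical. Agreement is expected up to floating-point rounding exactly at a band edge.

```python
import numpy as np
import pandas as pd

# Hypothetical price path: a seeded random walk around 100.
rng   = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 300).cumsum())

window, k = 20, 2.0
mean = close.rolling(window).mean()
std  = close.rolling(window).std()   # same estimator for both formulations

# Formulation 1: threshold on the Z-score.
z = (close - mean) / std
z_signal = np.where(z < -k, 1, np.where(z > k, -1, 0))

# Formulation 2: price crossing bands placed k standard deviations from the mean.
upper = mean + k * std
lower = mean - k * std
band_signal = np.where(close < lower, 1, np.where(close > upper, -1, 0))

mismatches = int((z_signal != band_signal).sum())
print("bars with a signal:", int((z_signal != 0).sum()))
print("mismatching bars:  ", mismatches)
```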

Why it works: If deviations of price from its rolling mean were approximately normally distributed, Z-scores beyond ±2 would occur only about 4.6% of the time. The mean reversion trade bets that such an extreme deviation is followed by a return toward the short-term average. The normal approximation is only a rough guide: in the sample run in Section 5, 53 of the 481 defined Z-scores (about 11%) breach ±2.
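
The tail probability quoted above follows from the standard normal distribution: P(|Z| > t) = erfc(t / √2). A minimal sketch using only the standard library (the helper name `two_sided_tail` is an assumption made for illustration):

```python
import math

def two_sided_tail(threshold: float) -> float:
    """P(|Z| > threshold) for a standard normal Z."""
    return math.erfc(threshold / math.sqrt(2.0))

for t in (1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"|Z| > {t:.1f}: {two_sided_tail(t):.4%}")
```

For a threshold of 2.0 this gives roughly 4.55%, which is where the "about 5%" rule of thumb comes from. Raising the threshold to 3.0 cuts the expected frequency to roughly 0.27%, so the threshold trades signal frequency against how extreme each signal is.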


4. Data Generation

[2]
def generate_data(periods: int) -> pd.DataFrame:
    """Generate synthetic 1-minute OHLCV candles from an unseeded random walk (each run differs)."""
    start_date     = pd.to_datetime("2024-01-01 00:00:00+00:00")
    datetime_index = pd.date_range(start_date, periods=periods, freq="1min", tz="UTC")
    price_data = []
    last_close = 42000
    for _ in range(periods):
        # The open gaps slightly from the previous close; the close moves further within the bar.
        open_price  = last_close + np.random.normal(0, last_close * 0.0005)
        close_price = open_price + np.random.normal(0, last_close * 0.005)
        body_high   = max(open_price, close_price)
        body_low    = min(open_price, close_price)
        # Wicks extend a non-negative distance beyond the candle body, so high >= low always holds.
        high_price  = body_high + abs(np.random.normal(0, last_close * 0.002))
        low_price   = body_low  - abs(np.random.normal(0, last_close * 0.002))
        # Truncate to integers and floor at 1 so prices stay positive.
        price_data.append({"open": max(1,int(open_price)), "high": max(1,int(high_price)),
                            "low":  max(1,int(low_price)),  "close": max(1,int(close_price))})
        last_close = close_price
    df = pd.DataFrame(price_data, index=datetime_index)
    df.index.name = "datetime"
    df["volume"] = np.random.uniform(100.0, 500.0, periods)
    # Expose the timestamp as a regular column and return a plain integer index.
    df["datetime"] = df.index.to_series()
    return df.reset_index(drop=True)

df = generate_data(500)
display(df.head())
    open   high    low  close      volume                   datetime
0  41981  42248  41826  42185  314.079864  2024-01-01 00:00:00+00:00
1  42149  42288  42145  42165  396.336709  2024-01-01 00:01:00+00:00
2  42191  42194  42131  42146  474.275682  2024-01-01 00:02:00+00:00
3  42165  42256  42123  42125  119.522158  2024-01-01 00:03:00+00:00
4  42123  42251  41747  41781  487.253949  2024-01-01 00:04:00+00:00

5. Strategy Function

[3]
def zscore_reversion_strategy(
    df:       pd.DataFrame,
    window:   int   = 20,
    entry_z:  float = 2.0,
) -> pd.DataFrame:
    df = df.copy().sort_values("datetime", ignore_index=True)

    roll_mean    = df["close"].rolling(window).mean()
    roll_std     = df["close"].rolling(window).std()
    df["zscore"] = (df["close"] - roll_mean) / roll_std.replace(0, np.nan)

    df["signal"] = np.where(df["zscore"] < -entry_z,  1,
                   np.where(df["zscore"] >  entry_z, -1, 0))

    return df

df_signals = zscore_reversion_strategy(df, window=20, entry_z=2.0)

print("--- Signal Distribution ---")
print(df_signals["signal"].value_counts())
print("\n--- Z-Score Statistics ---")
print(df_signals["zscore"].describe().round(4))
--- Signal Distribution ---
signal
 0    447
 1     36
-1     17
Name: count, dtype: int64

--- Z-Score Statistics ---
count    481.0000
mean      -0.4286
std        1.2630
min       -2.8989
25%       -1.3968
50%       -0.7337
75%        0.5272
max        2.7887
Name: zscore, dtype: float64

Explanation:

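  • roll_std.replace(0, np.nan): Guards against division by zero when every close in the window is identical, which makes the rolling standard deviation exactly 0. During the warm-up period (the first window − 1 rows) the rolling statistics are already NaN, so the Z-score is NaN and the signal defaults to 0. This is why the statistics above count 481 Z-scores for 500 rows.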
  • The Z-score is dimensionless — the same threshold (e.g., ±2.0) applies regardless of whether the asset price is $42,000 or $2. This makes Z-score thresholds directly transferable across assets without rescaling.
  • The window parameter controls the reference period for the mean and standard deviation — shorter windows make the signal more sensitive to recent price changes; longer windows make it more stable but slower to react.
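
The strategy function above emits per-bar entry signals only. One way to add an exit threshold is to turn the Z-score into a held position: enter when |Z| exceeds the entry threshold and close once Z has reverted to within an exit band around the mean. The sketch below is an illustration, not part of the notebook; the function name `zscore_positions`, the `exit_z` parameter, its default of 0.5, and the sample Z values are all assumptions.

```python
import numpy as np
import pandas as pd

def zscore_positions(zscore: pd.Series, entry_z: float = 2.0, exit_z: float = 0.5) -> pd.Series:
    """Return the held position per bar: +1 long, -1 short, 0 flat."""
    position = 0
    held = []
    for z in zscore:
        if not np.isnan(z):              # NaN (warm-up) leaves the position unchanged
            if position == 0:
                if z < -entry_z:
                    position = 1         # extreme low: go long
                elif z > entry_z:
                    position = -1        # extreme high: go short
            elif position == 1 and z >= -exit_z:
                position = 0             # long has reverted toward the mean
            elif position == -1 and z <= exit_z:
                position = 0             # short has reverted toward the mean
        held.append(position)
    return pd.Series(held, index=zscore.index, name="position")

# Hypothetical Z-score path to show the entry/exit behaviour.
z_demo = pd.Series([np.nan, -2.5, -1.5, -0.4, 0.1, 2.3, 1.0, 0.3])
positions = zscore_positions(z_demo, entry_z=2.0, exit_z=0.5)
print(positions.tolist())  # [0, 1, 1, 0, 0, -1, -1, 0]
```

With the notebook's output this could be applied as `zscore_positions(df_signals["zscore"])`. A wider gap between `entry_z` and `exit_z` holds trades longer; setting `exit_z` close to `entry_z` exits almost as soon as the extreme reading fades.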

6. Visualization

[4]
buy_signals  = df_signals[df_signals["signal"] ==  1]
sell_signals = df_signals[df_signals["signal"] == -1]

fig = make_subplots(rows=2, cols=1, shared_xaxes=True,
    subplot_titles=["Price + Signals", "Z-Score"],
    row_heights=[0.65, 0.35])

fig.add_trace(go.Candlestick(
    x=df_signals["datetime"],
    open=df_signals["open"], high=df_signals["high"],
    low=df_signals["low"],   close=df_signals["close"],
    name="Price"), row=1, col=1)

fig.add_trace(go.Scatter(
    x=buy_signals["datetime"],  y=buy_signals["low"]  * 0.999,
    mode="markers", marker=dict(symbol="triangle-up",   size=10, color="green"), name="Buy (+1)"),
    row=1, col=1)
fig.add_trace(go.Scatter(
    x=sell_signals["datetime"], y=sell_signals["high"] * 1.001,
    mode="markers", marker=dict(symbol="triangle-down", size=10, color="red"),   name="Sell (−1)"),
    row=1, col=1)

fig.add_trace(go.Scatter(
    x=df_signals["datetime"], y=df_signals["zscore"],
    mode="lines", name="Z-Score", line=dict(color="purple", width=1)), row=2, col=1)
fig.add_hline(y= 2.0, line_dash="dash", line_color="red",   row=2, col=1, annotation_text="+2σ")
fig.add_hline(y=-2.0, line_dash="dash", line_color="green", row=2, col=1, annotation_text="−2σ")
fig.add_hline(y= 0,   line_dash="dot",  line_color="gray",  row=2, col=1)

fig.update_layout(
    title_text="Z-Score Mean Reversion Strategy",
    xaxis_rangeslider_visible=False,
    height=700, yaxis=dict(autorange=True),
    xaxis2_title="Datetime", yaxis2_title="Z-Score",
)
fig.show()