Borrow-Cost Correction: Synthetic LETF Methodology + Revised Numbers
What was wrong
Every research script before 2026-05-19 built synthetic LETFs with the simplified daily-return formula:
```
daily_return = L * underlying_return - ER/252
```
This omits the daily borrow cost real LETFs pay on the leveraged portion. The full Testfolio-compatible formula is:
```
daily_return = L * underlying_return
             - ER/252
             - (L - 1) * (borrow_rate + spread) / 252
```
The third term models the fact that a 3x fund holds $300 of underlying exposure per $100 of capital, financing the $200 gap. Real LETFs implement this through swap contracts that pay roughly effective Fed Funds + 30-50bps.
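As a minimal sketch, the borrow-modeled formula above can be written as a plain function. The name `letf_daily_return` and its argument names are illustrative and are not the harness API; all rates are assumed to be annualized decimals.

```python
TRADING_DAYS = 252  # day count used by the formulas above


def letf_daily_return(underlying_return, leverage, expense_ratio,
                      borrow_rate, borrow_spread=0.0):
    """One day's synthetic LETF return under the borrow-modeled formula.

    All rates are annualized decimals (0.05 means 5% per year).
    """
    expense_drag = expense_ratio / TRADING_DAYS
    # Only the financed portion (L - 1) pays the borrow rate plus spread.
    borrow_drag = (leverage - 1) * (borrow_rate + borrow_spread) / TRADING_DAYS
    return leverage * underlying_return - expense_drag - borrow_drag


# Illustrative inputs: underlying up 1%, 3x fund, 0.86% ER,
# 5% short rate, 40bps spread.
simple = 3.0 * 0.01 - 0.0086 / TRADING_DAYS
full = letf_daily_return(0.01, 3.0, 0.0086, 0.05, 0.0040)
daily_gap = simple - full  # the part the simple formula leaves out
```

With these inputs the daily gap equals `2 * 0.054 / 252`, roughly 4.3bps per day. That looks negligible on any single day and compounds into the drift discussed in this note.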
Calibration vs real TQQQ
Real TQQQ, 2015-01-05 to 2024-12-31 (10 years, includes ZIRP era + 2022 rate shock):
| Method | Final wealth multiple | Drift vs real |
|---|---|---|
| Real TQQQ (live fund) | 20.41x | — |
| Simple formula (no borrow) | 33.00x | +62% |
| Borrow-modeled (^IRX + 40bps) | 21.46x | +5% |
The simple formula overshoots by 62%. The borrow-modeled formula tracks real TQQQ to within ~5% (residual is plausibly tracking error + the 40bps spread being a midpoint estimate of ProShares' actual swap pricing).
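The drift column is each method's final wealth multiple divided by the real fund's, minus one. A short check using only the table values follows. The annualized gap at the end is a derived illustration that assumes the window is exactly 10 years.

```python
real = 20.41      # real TQQQ final wealth multiple (from the table)
simple = 33.00    # simple formula, no borrow cost
modeled = 21.46   # borrow-modeled, ^IRX + 40bps


def drift(multiple, benchmark):
    """Relative overshoot of a synthetic multiple versus the real fund."""
    return multiple / benchmark - 1.0


simple_drift = drift(simple, real)
modeled_drift = drift(modeled, real)
print(f"{simple_drift:+.1%}, {modeled_drift:+.1%}")  # +61.7%, +5.1%

# Annualized return gap between the two synthetic methods,
# assuming a 10-year window.
annual_gap = (simple / modeled) ** (1 / 10) - 1.0
```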
New harness
Open-source as sma200-bt on GitHub (MIT licensed):
```bash
pip install sma200-bt
```

```python
from sma200_bt import synthetic_letf_returns, fetch_tbill_rate

# qqq: pandas Series of QQQ adjusted closes, loaded elsewhere
qqq_ret = qqq.pct_change()
tbill = fetch_tbill_rate(qqq_ret.index[0], qqq_ret.index[-1])
tqqq_syn = synthetic_letf_returns(
    qqq_ret,
    leverage=3.0,
    expense_ratio=0.0086,
    borrow_rate=tbill,
    borrow_spread=0.0040,  # 40bps over T-bill, ProShares swap-typical
)
```
Six pytest cases pin the formula behavior: simple-mode parity, high-rate drag, zero-rate parity, inverse leverage edge case, Series alignment, compound helper.
For backward compatibility, borrow_rate=None falls back to the old simple formula but logs a warning.
Revised: Synthetic TQQQ since 1999
Window: 1999-03-11 to 2026-05-15 (27.2 years). Methodology: synthetic_letf_returns + SMA200 filter on QQQ underlying, applied one day lagged to the synthetic TQQQ sleeve.
| Metric | B&H | SMA200 filter |
|---|---|---|
| Final $10k → | $25,575 | $183,452 |
| CAGR | 3.52% | 11.30% |
| Max drawdown | -99.98% | -95.39% |
| Sharpe (rf=0) | 0.448 | 0.467 |
| Sortino | 0.593 | 0.502 |
Sharpe diff (filter − B&H): +0.019.
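The metrics in the table can be computed from a daily return series along the following lines. This is a sketch and not the harness's own `metrics` helper. It assumes 252 trading days per year, rf=0, and a Sortino ratio whose downside deviation is measured around zero over all observations, which is one of several common conventions.

```python
import math

TRADING_DAYS = 252


def metrics(returns):
    """CAGR, max drawdown, Sharpe and Sortino (rf=0) from daily returns."""
    n = len(returns)
    wealth, peak, mdd = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r                    # equity = (1 + r).cumprod()
        peak = max(peak, wealth)
        mdd = min(mdd, wealth / peak - 1.0)  # deepest fall from a prior peak
    cagr = wealth ** (TRADING_DAYS / n) - 1.0
    mean = sum(returns) / n
    stdev = math.sqrt(sum((r - mean) ** 2 for r in returns) / (n - 1))
    # Downside deviation: root mean square of the negative returns only.
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in returns) / n)
    annualize = math.sqrt(TRADING_DAYS)
    return {
        "cagr": cagr,
        "mdd": mdd,
        "sharpe": mean / stdev * annualize if stdev > 0 else float("nan"),
        "sortino": mean / downside * annualize if downside > 0 else float("nan"),
    }


# Toy five-day series, purely illustrative.
m = metrics([0.01, -0.02, 0.015, -0.005, 0.02])
```

On the toy series the maximum drawdown is the single -2% day, since the equity curve does not make a new high until the final day.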
What changed vs the May 14 version
| Metric | Old (simple formula) | New (borrow-modeled) | Δ |
|---|---|---|---|
| B&H CAGR | 8.46% | 3.52% | −4.94 pp |
| Sharpe diff (filter − B&H) | +0.141 | +0.019 | −0.122 |
The simple-formula numbers were overstating the B&H story AND the filter's improvement. The filter's apparent rescue of leveraged ETFs largely evaporates when financing costs are honestly modeled.
Mechanism: the filter sidesteps drawdowns (good), but whenever it is LONG you are still paying the borrow cost. The drag is largest in high-rate regimes (1970s, early 2000s, 2022+), which are also when equities tend to be choppy or down, so the days the filter does spend LONG in those regimes earn less and still pay the full borrow drag.
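To put rough numbers on the drag, the annualized financing cost of a leveraged sleeve is `(L - 1) * (borrow_rate + spread)`. The short rates below are illustrative round figures, not historical data. The 40bps spread matches the calibration used in this note.

```python
def annual_borrow_drag(leverage, borrow_rate, spread=0.0040):
    """Annualized financing cost as a fraction of fund capital."""
    return (leverage - 1) * (borrow_rate + spread)


# Illustrative short rates: near-zero, moderate, and high.
for rate in (0.001, 0.02, 0.05):
    drag = annual_borrow_drag(3.0, rate)
    print(f"{rate:.1%} short rate -> {drag:.2%} per year")
# 0.1% short rate -> 1.00% per year
# 2.0% short rate -> 4.80% per year
# 5.0% short rate -> 10.80% per year
```

At near-zero rates the drag is about 1% per year and easy to miss. At a 5% short rate a 3x fund gives up more than 10% per year before the underlying moves at all.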
Revised: Defensive bucket comparison
Window: 2004-11-18 to 2026-05-15 (21.5 years, common-start of SPY+GLD+TLT). Methodology: SMA200 on SPY, 100% synthetic 3x SPY (UPRO-equivalent, borrow-modeled) when long, defensive bucket when flat.
Sorted by Sharpe, best first:
| Defensive bucket | CAGR | MaxDD | Sharpe | vs old Sharpe |
|---|---|---|---|---|
| 100% Gold (GLD) | 19.69% | -59.4% | 0.683 | 0.797 (−0.114) |
| 50% Gold + 50% TBILL | 18.88% | -52.6% | 0.676 | 0.786 (−0.110) |
| 50% Gold + 50% TLT | 18.85% | -57.2% | 0.671 | 0.784 (−0.113) |
| 33% Gold + 33% TLT + 34% TBILL | 18.54% | -54.8% | 0.668 | 0.778 (−0.110) |
| 100% TBILL | 17.71% | -53.5% | 0.651 | 0.755 (−0.104) |
| 100% TLT (ZROZ-proxy) | 17.49% | -62.1% | 0.635 | 0.744 (−0.109) |
| (baseline: B&H synthetic 3x SPY, no filter) | 16.19% | -95.5% | 0.552 | — |
Sharpe levels drop ~0.10 across all variants (borrow cost was uniformly missing in the May 17 numbers). The relative ordering is preserved — gold > combinations > cash > TLT — and the headline findings hold:
- Gold helps. +0.032 Sharpe over plain TBILL (was +0.042 before — slightly smaller edge, same sign).
- TLT/ZROZ does NOT help, arguably hurts. Worse Sharpe than plain TBILL AND the worst max DD in the test.
So the durable findings #2 and #3 from the index don't flip — but the magnitudes get a haircut and the "production-grade" claim is restored.
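The switching rule behind this comparison (leveraged sleeve when the underlying closed above its moving average the day before, defensive bucket otherwise) can be sketched in plain Python. The function names are illustrative, and `defensive_returns` is assumed to be the bucket's already-blended daily return series.

```python
def sma(values, window):
    """Trailing simple moving average; None until `window` points exist."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def switch_returns(prices, letf_returns, defensive_returns, window=200):
    """Daily strategy returns with a one-day-lagged SMA signal."""
    avg = sma(prices, window)
    result = []
    for t in range(len(prices)):
        # Day t's position is decided by day t-1's close (one-day lag).
        is_long = (t > 0 and avg[t - 1] is not None
                   and prices[t - 1] > avg[t - 1])
        result.append(letf_returns[t] if is_long else defensive_returns[t])
    return result


# Toy data with a 3-day window so the switch is visible.
prices = [100, 101, 102, 103, 99, 98, 104]
letf = [0.0, 0.03, 0.03, 0.03, -0.12, -0.03, 0.18]
cash = [0.001] * 7
strategy = switch_returns(prices, letf, cash, window=3)
```

In the toy run the strategy is long on days 3 and 4 only. Because of the lag it takes the -12% hit on day 4, then misses the +18% rebound on day 6.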
New: Managed Futures (DBMF) defensive bucket
DBMF only has data from 2019-05-08, so a clean test required a shorter 7-year subwindow:
| Defensive bucket | CAGR | MaxDD | Sharpe |
|---|---|---|---|
| 100% Gold | 27.64% | -52.2% | 0.851 |
| 50% MF + 50% Gold | 26.79% | -49.2% | 0.844 |
| 100% DBMF | 25.48% | -46.9% | 0.812 |
| 33% MF + 33% GLD + 34% TBILL | 25.11% | -49.5% | 0.810 |
| 25% each: MF + GLD + ZROZ + TBILL | 24.13% | -52.8% | 0.787 |
| 100% TBILL | 21.75% | -50.2% | 0.736 |
| 100% TLT | 20.54% | -62.1% | 0.691 |
- DBMF as sole defensive: +0.076 Sharpe over TBILL. Best single-asset max DD in the test (-46.9%).
- MF + Gold combos compete with pure gold with slightly better drawdown.
- The "25/25/25/25 across MF + GLD + ZROZ + TBILL" mix underperforms gold-heavy variants by ~0.06 Sharpe. The ZROZ slice is the drag on the combination.
Caveat: 7y window includes 2022 where managed futures strategies were uniquely positioned to profit from trend reversals in bonds, rates, and energy. Whether that persists in different regimes is unknown. Treat the +0.076 Sharpe edge as regime-dependent until a longer MF index series (e.g., SG CTA Index back to 2000) confirms it.
What this changes in the durable findings
Updated stance for the index in research/README.md:
| Finding | Old | New |
|---|---|---|
| #1 (SMA200 filter is real) | "Sharpe 0.60-0.74 on synthetic 3x SPY across 27 and 33 year windows" | Restated: Sharpe 0.46-0.68 with borrow cost. Real and durable on UNLEVERAGED equity; the LEVERAGED Sharpe edge is much narrower (+0.019 on synthetic TQQQ 1999-2026 vs +0.141 in May 14 numbers). Filter is best understood as a drawdown reducer on leverage, not a Sharpe enhancer. |
| #2 (gold defensive +0.04 Sharpe) | "+0.04 Sharpe vs cash" | Restated: +0.032 Sharpe vs cash (slightly tighter, same sign). |
| #3 (TLT/ZROZ don't defend) | "100% TLT gave the worst Sharpe... AND the worst MDD" | Unchanged. Borrow-cost correction doesn't affect the defensive instruments (only the leveraged sleeve). TLT still finishes worst on both axes. |
| (new) #11 | — | Managed futures (DBMF) work as defensive in the 2019-2026 window. +0.076 Sharpe over TBILL, best single-asset max DD. Regime-dependent until longer data confirms. |
Source
Runner: ad-hoc Python using trader.backtest.synthetic.synthetic_letf_returns + fetch_tbill_rate. Inline reproduction:
```python
import sys; sys.path.insert(0, '/Volumes/Mac External/Claudes/trader/src')
import warnings; warnings.filterwarnings('ignore')
import numpy as np, pandas as pd
import yfinance as yf
from trader.backtest.synthetic import (
    synthetic_letf_returns, fetch_tbill_rate, compound, TRADING_DAYS,
)

# Synthetic TQQQ since 1999
qqq = yf.download('QQQ', start='1999-03-10', end='2026-05-16',
                  auto_adjust=False, progress=False)['Adj Close']
qqq = qqq.squeeze()  # some yfinance versions return a one-column DataFrame here
qqq_ret = qqq.pct_change().dropna()
tbill = fetch_tbill_rate(qqq_ret.index[0], qqq_ret.index[-1])
tqqq_ret = synthetic_letf_returns(qqq_ret, leverage=3.0, expense_ratio=0.0086,
                                  borrow_rate=tbill, borrow_spread=0.0040)

# SMA200 signal on QQQ, lagged one day so each position uses the prior close.
# fill_value=False keeps the mask boolean (no NaN on the first row).
sma200 = qqq.rolling(200).mean()
above = (qqq > sma200).shift(1, fill_value=False).reindex(tqqq_ret.index)
filt = tqqq_ret.where(above, 0.0)
# metrics(...): cagr, mdd, sharpe, sortino from equity = (1+r).cumprod()
```
Why this happened + going forward
The simple formula was a research-tool shortcut: easier to write, no second data source, and at ZIRP rates (most of 2010-2021) it produces nearly identical results to the rigorous formula. The mistake was using it for windows where rates were materially non-zero (1999-2008, 2022+, 1940-1980) and treating the output as production-grade.
Policy going forward: any synthetic LETF used in a Reddit post, /learn article, or production-grade research file MUST use synthetic_letf_returns with a real borrow_rate. The simple-mode fallback exists only for reproducing old numbers.
Tests in tests/test_synthetic_letf.py enforce the formula. CI will catch regressions.
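The contents of tests/test_synthetic_letf.py are not reproduced in this note. As a hedged sketch of what pinning the formula can look like, the pytest-style cases below run against a local stand-in function (`letf_return`, hypothetical) and not the real harness. The test names borrow the case labels listed earlier, and the bodies are assumptions.

```python
TRADING_DAYS = 252


def letf_return(r, leverage, er, borrow_rate=None, spread=0.0):
    """Local stand-in for the formula; NOT the package's function."""
    out = leverage * r - er / TRADING_DAYS
    if borrow_rate is not None:
        out -= (leverage - 1) * (borrow_rate + spread) / TRADING_DAYS
    return out


def test_zero_rate_parity():
    # With zero rate and zero spread, the full formula equals the simple one.
    assert letf_return(0.01, 3.0, 0.0086, 0.0, 0.0) == \
        letf_return(0.01, 3.0, 0.0086)


def test_high_rate_drag():
    # A higher borrow rate must lower the return, all else equal.
    low = letf_return(0.0, 3.0, 0.0086, 0.001, 0.0040)
    high = letf_return(0.0, 3.0, 0.0086, 0.05, 0.0040)
    assert high < low


def test_unlevered_pays_no_borrow():
    # At L = 1 nothing is financed, so the borrow term vanishes.
    assert letf_return(0.01, 1.0, 0.0, 0.05, 0.0040) == 0.01
```

Run with `pytest` against a file containing these functions. Pointing them at the real `synthetic_letf_returns` would require adapting the inputs to whatever the harness expects.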
Revised: Portfolio archetypes (May 17 file audit)
Re-ran the headline portfolios from Portfolio archetypes search with proper borrow cost on UPRO, SSO, UGL synthetics. Window: 2000-08-30 to 2026-05-19 (25.7y).
Key portfolios: B&H vs SMA200-on-equity
| Portfolio | B&H Sharpe (new) | +SMA Sharpe (new) | Filter lift | B&H MDD | +SMA MDD |
|---|---|---|---|---|---|
| 75 UPRO / 25 UGL (equity-heavy benchmark) | 0.536 | 0.774 | +0.238 | -86.8% | -40.1% |
| 33 UPRO / 67 UGL (sweep optimal) | 0.745 | 0.804 | +0.059 | -61.6% | -41.5% |
| 40 UPRO / 50 UGL / 10 cash (N, headline) | 0.716 | 0.852 | +0.136 | -62.3% | -33.3% |
| 50 SSO / 40 UGL / 10 cash (L) | 0.722 | 0.869 | +0.147 | -54.4% | -27.3% |
| 25 UPRO / 75 UGL | 0.743 | 0.762 | +0.019 | -58.2% | -51.1% |
Sharpe deltas vs the May 17 numbers
| Portfolio | Old +SMA Sharpe | New +SMA Sharpe | Δ |
|---|---|---|---|
| 40 UPRO / 50 UGL / 10 cash (N) | 0.934 | 0.852 | −0.082 |
| 50 SSO / 40 UGL / 10 cash (L) | 0.933 | 0.869 | −0.064 |
Headline "Sharpe 0.934" becomes "Sharpe 0.852". Still genuinely good, but less viral.
UPRO/UGL ratio sweep (B&H, no filter), 25.7y
| UPRO% | UGL% | CAGR | MDD | Sharpe |
|---|---|---|---|---|
| 0% | 100% | 16.81% | -74.9% | 0.617 |
| 20% | 80% | 19.25% | -56.1% | 0.730 |
| 25% | 75% | 19.53% | -58.2% | 0.743 |
| 33% | 67% | 19.69% | -61.6% | 0.745 |
| 40% | 60% | 19.54% | -64.7% | 0.728 |
| 50% | 50% | 18.87% | -69.5% | 0.682 |
| 75% | 25% | 14.85% | -86.8% | 0.536 |
| 100% | 0% | 7.78% | -97.9% | 0.421 |
Sharpe-optimal ratio stays at 33% UPRO / 67% UGL (Sharpe 0.745, was 0.808). The equity-heavy 75/25 benchmark still gives Sharpe 0.536 (was 0.604), confirming heavy-gold > heavy-equity even with corrected borrow cost.
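A ratio sweep of this kind can be sketched as below. The sketch assumes daily rebalancing back to the target weights (the rebalancing rule is not stated in this note) and uses toy return series, so its output says nothing about the real UPRO/UGL numbers.

```python
import math


def sharpe(returns, periods=252):
    """Annualized Sharpe ratio with rf=0."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods)


def blend(a, b, weight_a):
    """Daily-rebalanced two-asset portfolio returns (assumed convention)."""
    return [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a, b)]


# Toy daily returns standing in for the UPRO and UGL synthetics.
upro = [0.03, -0.04, 0.02, -0.01, 0.035, -0.025]
ugl = [-0.01, 0.02, 0.005, 0.01, -0.015, 0.02]

weights = (0.0, 0.25, 0.33, 0.5, 0.75, 1.0)
sweep = {w: sharpe(blend(upro, ugl, w)) for w in weights}
best = max(sweep, key=sweep.get)  # weight with the highest Sharpe
```

The sweep is just a dictionary from UPRO weight to Sharpe. Swapping in the borrow-modeled synthetic series and a finer weight grid would follow the same pattern.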
Refined interpretation
The May 17 "best non-BTC Sharpe 0.934" claim was inflated by ~0.08 Sharpe due to borrow-cost omission. All directional findings hold:
- Heavy gold beats heavy equity in synthetic-LETF Sharpe terms.
- SMA200 on equity sleeve adds meaningful Sharpe lift across all portfolios, biggest on the most-leveraged (+0.238 on 75/25).
- MDD reduction from the filter is even larger than Sharpe lift suggests — 75/25 portfolio sees MDD go from -86.8% to -40.1%.
Crucial refined insight: the filter's Sharpe value scales with how much vol-decay risk is in the portfolio. Pure 3x leveraged ETFs (where the May 19 correction showed the filter lift drops to +0.019) get almost no Sharpe benefit because there's no defensive allocation to redirect into during the OFF periods. But in a portfolio with a substantial gold allocation, the filter routes the equity sleeve's drawdowns to a productive defensive bucket, compounding the protection.
This is the right way to frame the SMA200 filter going forward: it's a portfolio-construction tool, not a single-asset rescue mechanism. The Reddit/Bogleheads framing "does SMA200 save TQQQ B&H" is the wrong question.
Related studies
- 2026-05-14_sma200_extended_windows_synthetic.md — superseded synthetic LETF numbers; underlying S&P decade table is unaffected and stands.
- Defensive bucket comparison — superseded numbers; relative ordering preserved.
- Portfolio archetypes search — superseded numbers; directional findings strengthened (filter is a portfolio-level tool, not a single-asset rescue).
- src/trader/backtest/synthetic.py — new canonical harness.
- tests/test_synthetic_letf.py — formula pin.
This is research output, not investment advice. Backtest results do not predict future returns. Specific portfolio compositions discussed here are illustrative test cases, not allocation recommendations. Do your own research and consult a licensed advisor for personalized advice.