Tactical trading using synthetic prices

This article is a continuation of my previous post on using synthetic prices generated by MCMC to select strategy parameters. In this article, we discuss how one can augment the framework defined in the previous post to generate specific scenario based prices. Tactical Investment Algorithms Paper by Marcos Lopez de Prado 1, is one of the best papers around that details the power of using synthetic prices and MC backtests to find trading strategies optimized to work in a particular market regime i.e. trading tactically.

It is generally difficult to create strategies that perform consistently across different market regimes, for example, a simple strategy that shorts VIX futures performs well during low volatility regimes, but rakes in a lot of losses during high volatility periods. The volmageddon event of 2018 and the COVID crisis 2020 are two prime examples where such a strategy would not only wipe out all the previous profits but also incur tremendous losses in a short period. Creating synthetic prices that reflect a particular regime can be a very effective tool in mapping strategy’s performance to different market regimes, this relationship (if significant) can be effectively used to switch on/off strategies or change allocations of strategies to maximize profitability (raw or risk-adjusted).

This article describes two methods via which one can generate synthetic prices for different market scenarios/regimes. We later show how one can use different scenarios to dynamically adjust the strategy parameters such that it works optimally across different market regimes. The market scenarios defined in this article reflect the different volatility regimes. More specifically, we see markets operating in 2 simple regimes low volatility regime and high volatility regime.

Regime Switching Model

The first method involves defining our model as a mixture of two student T random walks, that switch based on the likelihood of operating within these two distributions. The two student T distributions represent the 2 market regimes/states we’re trying to model (high/low volatility), and the likelihood of being in a particular state is defined using the Markov/stochastic matrix that contains the transition probabilities of the two states (shown below).

Graphically this model can be represented as

The posterior of this model will contain the two distributions corresponding to different regimes, as well as the likelihood of being in a particular regime. With this posterior, we can then draw samples, select a particular threshold of switching probability, and reconstruct composite synthetic prices using the two random walks (corresponding to 2 regimes). Once we generate the composite prices using the blocks of the prices/returns that belong to a particular regime, strategies can be backtested and its performance can be mapped to the two states/regimes. (based on the threshold of the transition probabilities)

One of the issues with the method described above is that the fitting process is computationally expensive. Such poor convergence properties are mostly down to an exponentially large number of parameters that need to be fit using our sampler. A quick workaround to this method is described below, which we will use in our toy example.

Independent Regime Models

The second method is a little crude and involves splitting the prices/log-returns based on different market regimes and fitting them independently using the model described in the previous article. One can find a plethora of models that define market regimes a few examples include using a Markov regime-switching model, that oscillates between 2 regimes based on the switching probability or the likelihood of being in a particular state, or one can even use a time-series clustering model to define market clusters that reflect different regimes.

In this article, we use a very simple model to define low and high volatility regimes, based on the VIX index. Since the security used in this toy example is SPY ETF, the VIX index seems like a fair reflection of its volatility regime. We say the market regime is currently in low volatility state if the exponential weighted moving average (ewm) of the VIX index is below a certain threshold and in high volatility if it above that threshold (we can have more regimes but for the sake of simplicity we move ahead with just 2 regimes). The code block below splits the original price returns into 2 data frames/CSV files based on ewm of VIX index and the regime threshold. As shown below we use a half_life of 10 days and VIX Threshold of 20 to define our regimes. The two files are saved are SPY_regime_-1.0_ewm.csv and SPY_regime_0.0_ewm.csv, each file representing low and high volatility regimes respectively.

from create_etf_regimes import create_regimes

etf_file = "./SPY_2018.csv"
vix_file = "./vixcurrent.csv"
regime_thresh = [(0, 20), (20, 100)]
create_regimes(etf_file, vix_file, regime_thresh, half_life=10)

We will use the notebook from the previous article to create synthetic prices that correspond to low and high volatility regimes. The figure below plots the reconstructed prices for each regime of the SPY ETF. The plots on the left show the reconstructed price series from true/observed returns, and the ones on the left show the synthetic prices for each regime. The procedure to select the best prices is similar to the one described in the previous article, and the 50 best prices with the lowest Wasserstein-1 distance are selected.

Optimizing the strategy parameters based on market regimes

This section deals with how to use the regime based synthetic prices to switch between appropriate parameters based on the market regime. Generally, even for an extremely robust strategy, a single set of parameters do not tend to perform consistently across all regimes. During periods of high uncertainty, strategies need to operate with parameters that are more sensitive to the changes in the market environment, and during low volatility periods, it’s generally better to have parameters that are less responsive to avoid getting trapped in small random market movements. The process of switching the strategy parameters can be automated by either making these parameters a function of market prices/volatility or use fixed rules to switch between these parameters.

Using only actual/observed price is generally not recommended to perform this exercise as we can easily overfit the parameters across different regimes as there is only one price path under consideration.

Backtesting across high and low volatility regimes using synthetic prices

Like before, we use Bollinger Breakout Strategy as our toy strategy in this exercise (we use Double EMA against SMA used in the previous article) and test its performance on high and low volatility regime synthetic prices. The code block below runs the Bollinger Breakout Strategy on high and low volatility regime prices and outputs the total returns generated by each parameter set in both regimes.

Note: For sake of simplicity transaction costs and slippages are not considered in these backtests.

def gen_bband_signals(df, lbk, band_dev):
    close_price = df.values
    u_band, l_band, m_band = ta.BBANDS(close_price, timeperiod=lbk, nbdevup=band_dev,
                                           nbdevdn=band_dev, matype=3)
    
    bb_signals = np.asarray(np.zeros(close_price.shape)).astype(float)
    for i in range(lbk, len(bb_signals) - 1):
        if close_price[i] > u_band[i]:
            bb_signals[i] = 1
        elif close_price[i] < u_band[i] and close_price[i] >= m_band[i] and bb_signals[i - 1] == 1:
            bb_signals[i] = 1
        elif close_price[i] < l_band[i]:
            bb_signals[i] = -1
        elif close_price[i] > l_band[i] and close_price[i] <= m_band[i] and bb_signals[i - 1] == -1:
            bb_signals[i] = -1
        else:
            bb_signals[i] = 0
    
    return pd.Series(bb_signals, index=df.index) 

import itertools
def param_gen(param_list1, param_list2):
    return [params for params in itertools.product(param_list1, param_list2)]

def run_param_simulation(prices_df, param_list1, param_list2):
    param_sets = param_gen(param_list1, param_list2)
    mean_ret_list, stdev_list, param_list = [], [], []
    pctile_list = []
    for params in param_sets:
        strat_signals = prices_df.apply(gen_bband_signals, args = [params[0], params[1]])
        strat_returns = prices_df.pct_change(1)
        strat_perf = strat_signals.shift(2)*strat_returns
        mean_ret_list.append(strat_perf.sum(axis=0).mean())
        stdev_list.append(strat_perf.sum(axis=0).std())
        param_list.append(f"lbk={params[0]}_band={params[1]}")
        pctile_list.append(np.percentile(strat_perf.sum(axis=0), 0.1))
        
    return pd.DataFrame({"params": param_list, "mean_return":mean_ret_list, "stdev_return":stdev_list, "10_percentile_return": pctile_list})

In this article, we have used a much broader parameter space to test the strategy parameters as we’re testing on two regimes separately.

Backtest results on Low Volatility Synthetic Prices

regime1_params_df = run_param_simulation(regime_1_price_df, [10, 20, 40, 50, 60, 80, 125], [1, 1.5, 2, 2.5, 3])
regime1_params_df.sort_values(by=["mean_return"], ascending=False).head(10)
===========================================================================

idx	    params	mean_return	stdev_return	10_percentile_return
lbk=40_band=2.5	 0.160987	 0.292706	   -0.566362
lbk=40_band=1.5	 0.157259	 0.267162	   -0.459452
lbk=50_band=2	 0.153451	 0.246426	   -0.564509
lbk=50_band=1.5	 0.153148	 0.284339	   -0.572467
lbk=80_band=1	 0.137037	 0.253735	   -0.488103
lbk=60_band=1.5	 0.136425	 0.291135	   -0.649798
lbk=40_band=2	 0.131608	 0.244813	   -0.697184
lbk=80_band=1.5	 0.127187	 0.274369	   -0.606843
lbk=50_band=1	 0.121313	 0.271381	   -0.508462
lbk=20_band=1.5	 0.115015	 0.248762	   -0.404729

Backtest results on High Volatility Synthetic Prices

The top 10 parameters are chosen by filtering out the parameters when the mean_return is not equal to zero, since the for the cases when lbk > length of the series, no trades will be executed.

regime2_params_df = run_param_simulation(regime_2_price_df, [10, 20, 40, 60, 80, 125], [1, 1.5, 2, 2.5, 3])
regime2_params_df[regime2_params_df["mean_return"]!=0].sort_values(by=["mean_return"], ascending=False).head(10)

======================================================================

idx         params	mean_return	stdev_return	10_percentile_return
lbk=20_band=1.5	 0.006048	 0.076144	   -0.219411
lbk=40_band=1	 0.005533	 0.035488	   -0.053195
lbk=40_band=1.5	 0.003112	 0.033241	   -0.052823
lbk=40_band=3	 0.000035	 0.033038	   -0.043573
lbk=40_band=2.5	 -0.000052	 0.033555	   -0.052804
lbk=40_band=2	 -0.000792	 0.033611	   -0.052804
lbk=20_band=2	 -0.002059	 0.062380	   -0.135197
lbk=10_band=2	 -0.012680	 0.069832	   -0.131234
lbk=20_band=2.5	 -0.013961	 0.057289	   -0.137812
lbk=20_band=1	 -0.014373	 0.083520	   -0.137102

Backtest results on Actual Prices for both regimes

# low volatility regime
act1_trunc_params_df = run_param_simulation(spy1_df["recon_Adj_Close"].to_frame(), [10, 20, 40, 60, 80, 125], [1, 1.5, 2, 2.5, 3])
act1_trunc_params_df.sort_values(by=["mean_return"], ascending=False).head(10)
=============================================================================


idx         params	mean_return	stdev_return	10_percentile_return
26	lbk=125_band=1.5 -0.009509	    NaN	            -0.009509
27	lbk=125_band=2	 -0.016795	    NaN	            -0.016795
0	lbk=10_band=1	 -0.087420	    NaN	            -0.087420
1	lbk=10_band=1.5	 -0.114886	    NaN	            -0.114886
5	lbk=20_band=1	 -0.161860	    NaN	            -0.161860
21	lbk=80_band=1.5	 -0.212766	    NaN	            -0.212766
22	lbk=80_band=2	 -0.255993	    NaN	            -0.255993
2	lbk=10_band=2	 -0.266329	    NaN	            -0.266329
3	lbk=10_band=2.5	 -0.266329	    NaN	            -0.266329
4	lbk=10_band=3	 -0.266329	    NaN	            -0.266329

act2_trunc_params_df = run_param_simulation(spy2_df["recon_Adj_Close"].to_frame(), [10, 20, 40, 60, 80, 125], [1, 1.5, 2, 2.5, 3])
act2_trunc_params_df.sort_values(by=["mean_return"], ascending=False).head(10)
==========================================================================

idx    	    params	mean_return	stdev_return	10_percentile_return
lbk=10_band=1.5	 0.036104	    NaN	            0.036104
lbk=40_band=1	 0.012375	    NaN	            0.012375
lbk=40_band=1.5	 0.012375	    NaN	            0.012375
lbk=40_band=2	 0.012375	    NaN	            0.012375
lbk=40_band=2.5	 0.012375	    NaN	            0.012375
lbk=40_band=3	 0.012375	    NaN	            0.012375
lbk=10_band=2	 -0.012616	    NaN	            -0.012616
lbk=10_band=2.5	 -0.012616	    NaN	            -0.012616
lbk=10_band=3	 -0.012616	    NaN	            -0.012616
lbk=20_band=1.5	 -0.026929	    NaN	            -0.02692

Chosing the best possible parameters from synthetic price and actual price backtests

The parameter selection criteria again are exactly the same as the one we defined in the previous article, where the best parameter based on the highest mean_return is chosen for actual price backtests, and an intersection based framework is used to select the best possible parameters for each regime.

Best parameter set based on backtest results on low volatility synthetic prices

Based on the intersection rule we found just 1 parameter candidate for low volatility regime, which we will use for the final Out-of-Sample test.

best_returns1 = regime1_params_df.sort_values(by=["mean_return"], ascending=False).head(10)["params"]
best_stdev1 = regime1_params_df.loc[regime1_params_df["mean_return"]>0].sort_values(by=["stdev_return"], ascending=True).head(10)["params"]
best_10_pct1 = regime1_params_df.loc[regime1_params_df["mean_return"]>0].sort_values(by=["10_percentile_return"], ascending=False).head(10)["params"]
best_params1 = list(set(best_returns1).intersection(set(best_stdev1)).intersection(set(best_10_pct1)))
best_params_perf1 = regime1_params_df.loc[regime1_params_df["params"].isin(best_params1)]
best_params_perf1
==========================================================================

idx          params	mean_return  stdev_return   10_percentile_return
6	lbk=20_band=1.5	 0.115015	0.248762	-0.404729

Best parameter set based on backtest results on high volatility synthetic prices

On the high volatility regime we just found 4 parameter set that had a positive return in the backtests the return profile seems insignificant due to the short length of the high volatility prices. Based on the table below we will choose the second parameter set i.e. lbk=40_band=1 as it had slighly lower mean return values but significantly less 10_percentile_return value than lbk=20_band=1.5 parameter

best_returns2 = regime2_params_df.sort_values(by=["mean_return"], ascending=False).head(10)["params"]
best_stdev2 = regime2_params_df.loc[regime2_params_df["mean_return"]>0].sort_values(by=["stdev_return"], ascending=True).head(10)["params"]
best_10_pct2 = regime2_params_df.loc[regime2_params_df["mean_return"]>0].sort_values(by=["10_percentile_return"], ascending=False).head(10)["params"]
best_params2 = list(set(best_returns2).intersection(set(best_stdev2)).intersection(set(best_10_pct2)))
best_params_perf2 = regime2_params_df.loc[regime2_params_df["params"].isin(best_params2)]
best_params_perf2
==========================================================================
idx         params	mean_return	stdev_return	10_percentile_return
6	lbk=20_band=1.5	 0.006048	 0.076144	    -0.219411
10	lbk=40_band=1	 0.005533	 0.035488	    -0.053195
11	lbk=40_band=1.5	 0.003112	 0.033241	    -0.052823
14	lbk=40_band=3	 0.000035	 0.033038	    -0.043573

Out-of-Sample performance of the dynamic parameter switching strategy

We test the out-of-sample performance on actual prices (2019-2020) using the high and low volatility parameters found using both actual prices and synthetic prices. To switch the parameters using the VIX regimes we will need to modify the bollinger breakout strategy function to accept 3 more parameters, which are lbk and band_dev for the second regime and the threshold for defining the VIX_EMA regimes. The code block below implements such a function which we will use to test the out-of-sample performance of parameter switching strategy.

def gen_bband_multi_regime_signals(df, lbk1, band_dev1, lbk2, band_dev2, regime_thresh):
    close_price = df["Adj_Close"].values
    vix_close_lag1 = df["VIX Close"].shift(1).values
    
    u_band_1, l_band_1, m_band_1 = ta.BBANDS(close_price, timeperiod=lbk1, nbdevup=band_dev1,
                                           nbdevdn=band_dev1, matype=3)
        
    u_band_2, l_band_2, m_band_2 = ta.BBANDS(close_price, timeperiod=lbk2, nbdevup=band_dev2,
                                           nbdevdn=band_dev2, matype=3)
    
    bb_signals = np.asarray(np.zeros(close_price.shape)).astype(float)

    for i in range(max(lbk1, lbk2), len(bb_signals) - 1):

        if vix_close_lag1[i] <= regime_thresh:

            if close_price[i] > u_band_1[i]:
                bb_signals[i] = 1
            elif close_price[i] < u_band_1[i] and close_price[i] >= m_band_1[i] and bb_signals[i - 1] == 1:
                bb_signals[i] = 1
            elif close_price[i] < l_band_1[i]:
                bb_signals[i] = -1
            elif close_price[i] > l_band_1[i] and close_price[i] <= m_band_1[i] and bb_signals[i - 1] == -1:
                bb_signals[i] = -1
            else:
                bb_signals[i] = 0
                
        elif vix_close_lag1[i] > regime_thresh:
            if close_price[i] > u_band_2[i]:
                bb_signals[i] = 1
            elif close_price[i] < u_band_2[i] and close_price[i] >= m_band_2[i] and bb_signals[i - 1] == 1:
                bb_signals[i] = 1
            elif close_price[i] < l_band_2[i]:
                bb_signals[i] = -1
            elif close_price[i] > l_band_2[i] and close_price[i] <= m_band_2[i] and bb_signals[i - 1] == -1:
                bb_signals[i] = -1
            else:
                bb_signals[i] = 0

    
    return pd.Series(bb_signals, index=df.index) 

The next step involves augmenting the dataframe containing the actual out-of-sample prices with the VIX_EWM prices. The code block below merges the two prices.

vix_file_path = "./vixcurrent.csv"
vix_df = pd.read_csv(vix_file_path)
vix_df["date"] = pd.to_datetime(vix_df["date"])
vix_df = vix_df.set_index("date")
impvol_ema_df = vix_df["VIX Close"].ewm(halflife=10).mean().to_frame()
act_os_vix_df =act_df_os.join(impvol_ema_df, how='left').ffill()

Out-of-Sample performance using parameters found using actual prices

We can now proceed with backtesting the dynamic parameter switching strategy on actual prices, the code blocks below calculate the out-of-sample backtest performance on the entire series as well as on individual volatility regimes.

note since all the parameter sets had a negative return profile for the low volatility period we will use extreme valued parameters such that no trades are taken during this period. i.e by using parameters lbk=250_band=200 we are certain with a high degree of confidence that the algorithm will not trade during the low volatility period

strat_signals_act = gen_bband_multi_regime_signals(act_os_vix_df, 250, 20, 10, 1.5, 20)
underlying_returns = act_os_vix_df["Adj_Close"].pct_change(1)
strat_perf_act = strat_signals_act.shift(2)*underlying_returns
regime_act_df = pd.DataFrame({"params": ["lbk=250/10_band=200/1.5"], "mean_return":strat_perf_act.sum(axis=0).mean(), "stdev_return":strat_perf_act.sum(axis=0).std(), "10_percentile_return": np.percentile(strat_perf_act.sum(axis=0), 0.1)})
regime_act_df
===========================================================================
idx	        params	        mean_return	stdev_return	10_percentile_return
0	lbk=250/10_band=200/1.5	 0.079288	    0.0	            0.079288

Performance on individual regimes is calculated below

strat_perf_act_df = strat_perf_act.to_frame()
strat_perf_act_df.columns = ["strat_returns"]
strat_perf_act_df = act_os_vix_df.join(strat_perf_act_df, how="left").fillna(0)
strat_perf_act_df["VIX Close Lag"] = strat_perf_act_df["VIX Close"].shift(1).fillna(0)
strat_act_reg_1 = strat_perf_act_df[strat_perf_act_df["VIX Close Lag"]<=20]
strat_act_reg_2 = strat_perf_act_df[strat_perf_act_df["VIX Close Lag"]>20]
print(f"strategy return in low volatility regime is: {strat_act_reg_1.strat_returns.sum()} and in high volatility regime is: {strat_act_reg_2.strat_returns.sum()}")
========================================================================================
"strategy return in low volatility regime is: 0.0
 and in high volatility regime is: 0.07928802522667144"

Out-of-Sample performance using parameters found using synthetic prices

Backtest performance on the Out-of-Sample actual prices using dynamic parameters dound using synthetic prices is,

strat_signals_syn = gen_bband_multi_regime_signals(act_os_vix_df, 20, 1.5, 40, 1, 20)
strat_perf_syn = strat_signals_syn.shift(2)*underlying_returns
regime_syn_df = pd.DataFrame({"params": ["lbk=20/40_band=1.5/1"], "mean_return":strat_perf_syn.sum(axis=0).mean(), "stdev_return":strat_perf_syn.sum(axis=0).std(), "10_percentile_return": np.percentile(strat_perf_syn.sum(axis=0), 0.1)})
regime_syn_df

============================================================================
idx            params	       mean_return	stdev_return	10_percentile_return
0	lbk=20/40_band=1.5/1	 0.378265	    0.0	            0.378265

and performance on individual volatility regimes is,

strat_perf_syn_df = strat_perf_syn.to_frame()
strat_perf_syn_df.columns = ["strat_returns"]
strat_perf_syn_df = act_os_vix_df.join(strat_perf_syn_df, how="left").fillna(0)
strat_perf_syn_df["VIX Close Lag"] = strat_perf_syn_df["VIX Close"].shift(1).fillna(0)
strat_syn_reg_1 = strat_perf_syn_df[strat_perf_syn_df["VIX Close Lag"]<=20]
strat_syn_reg_2 = strat_perf_syn_df[strat_perf_syn_df["VIX Close Lag"]>20]
print(f"strategy return in low volatility regime is: {strat_syn_reg_1.strat_returns.sum()} and in high volatility regime is: {strat_syn_reg_2.strat_returns.sum()}")
=============================================================================
"strategy return in low volatility regime is: 0.03793275450103961
 and in high volatility regime is: 0.340332731206179"

Conclusion and Future Work

Based on the results above it is clearly visible that strategy parameters chosen by synthetic scenarios perform well across different regimes, which is not the case when we use the parameters found on actual price scenarios. The synthetic parameters across low and high volatility outperform the ones on the actual parameters over the same regime.

param type\regime| High_vol  |   Low_vol    |
=================|===========|==============|
Act Price Params |   7.9%    |   0%         |
-----------------|-----------|--------------|
Syn Price Params |   34%     |    3.8%      |
-----------------|-----------|--------------|

The toy example details how one can tactically switch the strategy parameters of a strategy based on market regimes defined according to the VIX index. However, this is just one simple application of the synthetic prices.

Instead of just choosing the appropriate parameters for a given strategy based on market regimes we can even switch between different types of straetgies based on the market environment. i.e. tactically switching between momentum and mean-reversion strategies, or switching to appropriate tail strategies during adverse market conditions
One can even make synthetic prices that reflect custom scenarios, i.e. pre-election/post-election/pre-brexit/post-brexit scenarios.
We use these synthetic scenarios of risk management as well, by setting optimal leverage levels of a given portfolio or a strategy or to calculate Value at Risk measures for different scenarios

This article detailed just the univariate case of DGP, however we can generate multivariate synthetic prices as well by using

  -  Chloseky decomposition in case of MCMC based DGP
  -  Time-Series GAN for GAN DGP
  -  Deep Convolutional VAE for Varational Autoencoder DGP

The repo and notebook can be accessed here

References

TACTICAL INVESTMENT ALGORITHMS: Marcos López de Prado
Stochastic Volatility Model, https://docs.pymc.io/notebooks/stochastic_volatility.html
The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo,
   https://arxiv.org/pdf/1111.4246.pdf
Probabilistic Programmming in Python using PYMC3, https://arxiv.org/pdf/1507.08050.pdf

Written on October 11, 2020