Optimization of Trading Systems and Portfolios
John Moody and Lizhong Wu - Oregon Graduate Institute of Science & Technology
We propose to train trading systems and portfolios by optimizing objective
functions that directly measure trading and investment performance. Rather
than basing a trading system on forecasts or training via a supervised
learning algorithm using labeled trading data, we train our systems using
recurrent reinforcement learning algorithms. The objective functions that
we consider as evaluation functions for reinforcement learning are profit or
wealth, economic utility, the Sharpe ratio, and our proposed Differential
Sharpe Ratio. The trading and portfolio management systems require prior
decisions as input in order to properly take into account the effects of
transaction costs, market impact, and taxes. This temporal dependence on
system state requires the use of reinforcement versions of standard
recurrent learning algorithms. We present empirical results in controlled
experiments that demonstrate the efficacy of some of our methods. We find
that maximizing the Differential Sharpe Ratio yields more consistent results
than maximizing profits, and that both methods outperform a trading system
based on forecasts that minimize mean squared error (MSE).
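
The abstract does not give formulas, so the following Python sketch is illustrative only. It assumes a single asset, a position F_t in [-1, 1], and a proportional transaction cost, so that the trading return R_t = F_{t-1} r_t - cost * |F_t - F_{t-1}| depends on the prior decision. It also assumes one commonly cited form of a differential Sharpe ratio built from exponential moving estimates of the first and second moments of returns. The function names, the `cost` and `eta` parameters, and the sample numbers are hypothetical.

```python
import math


def trading_returns(asset_returns, positions, cost=0.001):
    """Per-period trading returns with proportional transaction costs.

    Assumed form: R_t = F_{t-1} * r_t - cost * |F_t - F_{t-1}|.
    The previous position enters every term, which is why the system
    needs its own prior decision as an input.
    """
    out = []
    prev = 0.0  # assumption: start with no position
    for r, pos in zip(asset_returns, positions):
        out.append(prev * r - cost * abs(pos - prev))
        prev = pos
    return out


def sharpe_ratio(returns):
    """Mean return divided by the (population) standard deviation."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((x - mean) ** 2 for x in returns) / n
    return mean / math.sqrt(var)


def differential_sharpe(returns, eta=0.01):
    """Online performance signal from moving moment estimates (assumed form).

    A and B are exponential moving averages of R and R^2 with rate eta:
        D_t = (B * dA - 0.5 * A * dB) / (B - A^2)^(3/2)
    using the values of A and B from the previous step.
    """
    a, b = 0.0, 0.0
    out = []
    for r in returns:
        da = r - a
        db = r * r - b
        var = b - a * a
        # Guard against a zero variance estimate at start-up.
        d = (b * da - 0.5 * a * db) / var ** 1.5 if var > 0 else 0.0
        out.append(d)
        a += eta * da
        b += eta * db
    return out


if __name__ == "__main__":
    r = [0.01, -0.02, 0.03]   # hypothetical asset returns
    f = [1.0, 1.0, -1.0]      # hypothetical positions
    rt = trading_returns(r, f)
    print(rt)
    print(sharpe_ratio(rt))
    print(differential_sharpe(rt, eta=0.1))
```

Because the differential form can be evaluated at every time step, it can serve as an incremental reward for a recurrent reinforcement learner, whereas the ordinary Sharpe ratio requires the full return series.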
Scheduled for Session 3.4 Financial Models - II