[ Agenda | Sessions | Program ]

The Self-Evolving Logic of Financial Claim Prices

Thomas A. Noe - Georgia State University


In this paper, we will price financial claims by allowing option pricing programs to evolve trough time via combining with each other, mutating randomly, and reproducing at rates based on the pressure of evolutionary selection. The specific technique we employ is Genetic Programming, an optimization technique based on the principles of natural selection. Compared to the traditional arbitrage-based approach, this technique is useful when the underlying asset dynamics are unknown or when the pricing equations are too complicated to solve analytically. Compared to other established data-driven option pricing techniques such as neural networks, implied binomial trees, etc., genetic programming has the advantage of not restricting the structure of the pricing formulas, formulae themselves evolve rather than simply the parameters of a single formula. Our analysis is preliminary. However, by showing that genetic programming can recover Black-Sholes formula from a fairly small data sample, we hope to validate the ability of genetic programming approaches to consistently and efficiently estimate option prices, at least in structurally simple environments. Future research will apply genetic programming approach to more intractable problems in derivative asset pricing.

The traditional approach to pricing a derivative security, pioneered by Black and Scholes (1973) and Merton(1973), is to specify a stochastic process for the underlying asset, derive the pricing equation for the derivative using the no-arbitrage argument, and solve the pricing equation to find the derivative price. Although this approach has enjoyed tremendous success both in practice and research, it may produce poor prices estimates in some cases. For example, the stochastic process for the underlying asset could be misspecified, the parameters for the process could be misestimated, or the pricing equation could be too complicated to solve analytically. The well-known volatility smile phenomenon illustrates the difficulty of fitting actual prices with simple pricing models. Booming derivative markets generate a large amount of data. This has lead a number of researchers to suggest data-driven approaches to pricing problems which extract pricing measures directly from observed data. In fixed income research, this is called the ``no-arbitrage approach.'' For equity options, Rubinstein (1994), Dupire (1994), Derman and Kani (1994) have provided methods to construct a discrete representation of the risk-neutral density that is exactly consistent with a set of observed option prices. A different data-driven approach is taken by Hutchinson, Lo and Poggio (1994). They utilize a radial-bias neural network algorithm to to price and hedge options.

These data-driven techniques have one common characteristic. They require the researcher to prespecify the underlying model structure and then estimate the models's parameters. One natural question is whether there is a way of pricing assets without prespecifing the pricing structure or the form of the relation between the underlying asset's and the derivative's price. In this project, we investigate such a method, genetic programming, for searching for the optimal pricing structure using prices of traded options, both real market prices and simulated prices based on traditional option pricing formulae. By applying the algorithms to simulated data sets, we can check the ability of genetic programs to generate consistent estimates in the presence of a well-defined stochastic structure. By applying the technique to real data, we can investigate whether genetic programming can solve some intractable problems related to fitting observed prices, such as the volatility smile.

Genetic programming was developed by Koza (1992) (1994). It shares many features with genetic algorithm which is developed by Holland (1962) (1976) as both are optimization techniques based on the principles of natural evolution. It works as follows: The environment consists of an initial population of randomly generated solution candidates. Initially, each candidate is evaluated according to a problem-specific function, fitness, which measures how successfully the program solves the problem. Next, the solution candidates are recombined through crossover and mutation to generate new solution candidates, their offspring. Crossover ``marries'' two computer programs to generate new programs. Mutation involves random modification of the code of an existing program. These newly generated programs are again evaluated and reproduce their offspring following the same selection process. This evolution process is stopped when some termination criteria is met.

In genetic programming, the solution candidates are represented by tree-like, hierarchical structured, computer programs. Each node of this tree is a function, or ``building block,'' which takes the outputs of its branch nodes as inputs and produce an output. If the function has no input, it is a terminal function which becomes a terminal node of the tree. The whole program is computed recursively from the terminal nodes to the root of the tree. We can represent the pricing formulas as tree-like programs. The size, shape and content of the trees can change dynamically during the evolution process. This is why genetic programming does not restrict the underlying structure of the solutions. The search space of genetic programming is huge: the space of all possible tree-like programs. To be effective in searching such a large and complex space, the algorithm not only conducts a parallel search of many points, through fitness evaluation of of an entire pool of programs, but also implicitly processes, in parallel, a large amount of useful information regarding the attributes in each solution program. The Darwinian fitness-proportionate reproduction allocates an exponentially increasing weight to those better attributes. This selection routine, when combined with spontaneous and dynamic emergence of new programs through crossover and mutation, leads to an optimal search in the stochastic environment by trading off, in a judicious fashion, the benefits of extensive search in a promising direction against the costs of ``overfocusing,'' and thus ignoring large regions of the search space.

More specifically, to find the pricing formulae of the derivatives, we treat the output of a program tree as the price of the derivative and compare this price to the market price. Approximating market prices will be the selection criteria. Although the resulting program would be a tree presentation of a complicated formula with many redundant and useless branches, studying such program can still shed light to the hidden (especially structural) relation between the asset and the derivative. To assess the potential value of genetic programming in derivative valuation and hedging, we will train and test the programs with both simulated data and real data. Following Hutchinson, Lo and Poggio (1994), we will simulate the option prices using Black-Scholes formula, and investigate whether genetic programming can generate programs to recover Black-Scholes option pricing formula the simulated date. In future research, we plan to purchase data sets of real option prices and apply genetic programming to generate pricing programs from this data. Comparison with other popular methods will be made to assess the merit of genetic programming. We are also interested in applying genetic programming to the estimation of the prices of interest-rate sensitive contingent claims.


Scheduled for Session 5.4 Computational Finance

[ Agenda | Sessions | Program ]