The terms “machine learning” and “artificial intelligence” are currently being bandied about by the press, in marketing materials, and by industry influencers as if the Jetsons era is finally upon us. Although my self-driving car and robotic butler have yet to arrive (sorry, Alexa!), the future is now in the sense that machine learning is a powerful tool for solving today’s most difficult big data and quantitative modeling challenges, and will become more so in tomorrow’s world as the field matures.
Many of the techniques now called “machine learning” used to fall under the heading of “predictive analytics” (or simply “statistics”, depending on when you went to school and who your professors were). As computing power has caught up with the theory, the lines between these techniques have blurred, with the machine itself playing a leading role. Regardless of the terminology used, machine learning possesses a few key attributes:
> Machine learning typically processes large quantities of data.
> The emphasis is on prediction and classification over comprehension.
> The systems being modeled are highly complex and often non-linear.
> The algorithms are dynamic and adaptive.
As a machine learning model consumes new data, it revises its state, and sometimes even its own structure, adapting to new conditions and improving its ability to predict as it goes, in much the same way that humans learn. Human engineers select the machine-learning framework suited to the problem (based on experience and business requirements), then shape, refine, constrain, and train the models. Then the engineers let the models do their thing. The result is a balance between human insight and algorithmic predictive power. The resulting models can drive a car, play Go, or select a movie for you on Netflix.
What does any of this have to do with Hanweck’s analytics? Quite a bit, actually. Our core competencies lie at the intersection of quantitative finance and computer science. Our primary product offering employs sophisticated statistical models underlying the latest quantitative techniques and processes vast quantities of real-time data from the derivatives markets, all under the imperative of delivering analytics in real time in a market that moves close to the speed of light. Machine learning – properly managed – is one of the tools that enables us to deliver.
For example, a workhorse machine learning technique that we use quite often is Dynamic Bayesian Modeling (DBM). DBM is a machine-learning framework that allows us to handle noisy – and possibly incomplete – data sets, efficiently estimate (possibly non-linear) market model parameters, adapt to changing market conditions, and make accurate predictions. For instance, options market data is noisy: bid/ask spreads can be very wide, bids or asks can disappear completely, trades can get canceled, exchanges can put out bad data, etc. DBM allows us to incorporate the history of observed noisy, incomplete data into a structured model of the “true” price of an option. The model considers where the option itself was priced and where related options were priced and develops an expectation of what the correct price should be. That expectation is then used to derive a more accurate theoretical value for the option than, say, the mid-market price. Having a more accurate theoretical value improves our estimates of risk sensitivities (the Greeks), implied volatility surfaces, stock borrow-loan rates and P/L estimates.
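To make the idea concrete, here is a minimal sketch of a Bayesian update on a noisy option quote stream. It is a toy illustration, not Hanweck's production model. It assumes the “true” price follows a random walk, that the mid-market price is a noisy observation of it, and that the half bid/ask spread is a reasonable proxy for the noise level. The function name `filter_option_price` and the parameter `process_var` are hypothetical, chosen only for this example.

```python
from typing import List, Optional, Sequence, Tuple

Quote = Optional[Tuple[float, float]]  # (bid, ask), or None when the quote is missing


def filter_option_price(
    quotes: Sequence[Quote],
    prior_mean: float,
    prior_var: float,
    process_var: float = 0.0004,  # assumption: variance of the true price move per step
) -> List[Tuple[float, float]]:
    """Track a latent 'true' price from noisy, possibly missing bid/ask quotes.

    Returns the posterior (mean, variance) after each time step.
    """
    mean, var = prior_mean, prior_var
    path: List[Tuple[float, float]] = []
    for quote in quotes:
        # Predict: the true price may have drifted, so uncertainty grows.
        var += process_var
        if quote is not None:
            bid, ask = quote
            mid = 0.5 * (bid + ask)
            # Assumption: the half-spread is the standard deviation of the
            # observation noise, so wide markets carry less information.
            obs_var = max((0.5 * (ask - bid)) ** 2, 1e-12)
            gain = var / (var + obs_var)
            # Update: move toward the mid in proportion to how much we trust it.
            mean += gain * (mid - mean)
            var *= 1.0 - gain
        # With no quote, the prediction is carried forward unchanged.
        path.append((mean, var))
    return path


if __name__ == "__main__":
    stream = [(1.00, 1.10), None, (0.90, 1.40), (1.04, 1.08)]
    for step, (m, v) in enumerate(filter_option_price(stream, 1.05, 0.01)):
        print(f"step {step}: estimate={m:.4f} variance={v:.6f}")
```

In this sketch a missing quote leaves the estimate where it was and only widens its uncertainty, while a very wide market nudges the estimate far less than a tight one does. A fuller model would also pull in information from related options, as described above.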
DBM comes in many flavors. A very powerful DBM is the Kalman filter. Although the classical Kalman filter is a linear model, it has been extended to handle non-linear problems, such as those encountered in derivatives pricing. Extensions and related non-linear filters include the Extended Kalman Filter, the Unscented Kalman Filter and Particle Filters. Kalman filters are a helpful tool for separating signal from noise in the high-dimensional, high-complexity problems encountered in options analysis. From noisy market data, they produce dynamic estimates of the parameters of options pricing models, such as implied volatility. The filter tracks the joint state of the model variables and adjusts itself through time to produce the theoretical surfaces. For instance, we use Unscented Kalman filters for fitting and estimating the parameters of volatility surface models dynamically, in real time, from noisy observed market data.
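For readers who want to see the mechanics, below is a rough sketch of a single Unscented Kalman filter step in the simplest possible setting: one option, one hidden state (its implied volatility), and a Black-Scholes call price as the non-linear measurement. This is a deliberately small toy, not our surface-fitting model. The function names `bs_call` and `ukf_vol_step`, the random-walk assumption for volatility, and the defaults for `process_var` and `kappa` are all assumptions made for illustration.

```python
import math


def bs_call(spot: float, strike: float, rate: float, tau: float, vol: float) -> float:
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * tau) / (vol * sqrt_tau)
    d2 = d1 - vol * sqrt_tau

    def cdf(x: float) -> float:
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return spot * cdf(d1) - strike * math.exp(-rate * tau) * cdf(d2)


def ukf_vol_step(vol_mean, vol_var, observed_price, price_noise_var,
                 spot, strike, rate, tau, process_var=1e-5, kappa=2.0):
    """One predict/update step of a one-dimensional unscented Kalman filter.

    State: implied volatility, assumed to follow a random walk.
    Measurement: observed option price = Black-Scholes(vol) + noise.
    """
    # Predict: random-walk state, so only the variance grows.
    vol_var = vol_var + process_var

    # Sigma points and weights for a one-dimensional state (n = 1).
    n = 1
    spread = math.sqrt((n + kappa) * vol_var)
    points = [vol_mean, vol_mean + spread, vol_mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Assumption: floor the sigma points so the pricing formula stays defined.
    points = [max(p, 1e-4) for p in points]

    # Push each sigma point through the non-linear pricing function.
    prices = [bs_call(spot, strike, rate, tau, p) for p in points]
    price_mean = sum(w * z for w, z in zip(weights, prices))
    price_var = sum(w * (z - price_mean) ** 2 for w, z in zip(weights, prices))
    price_var += price_noise_var
    cross_cov = sum(w * (p - vol_mean) * (z - price_mean)
                    for w, p, z in zip(weights, points, prices))

    # Update: the standard Kalman gain, built from the sigma-point statistics.
    gain = cross_cov / price_var
    new_mean = vol_mean + gain * (observed_price - price_mean)
    new_var = vol_var - gain * gain * price_var
    return new_mean, new_var
```

Calling `ukf_vol_step` repeatedly with each new observed price moves the volatility estimate toward the level that explains the quotes, without ever differentiating the pricing function. That is the practical appeal of the unscented approach over the Extended Kalman Filter, which linearizes the model at each step.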
Whether you prefer Unscented Kalman Filters or the heavily perfumed variety is a topic for another day. Suffice it to say that machine learning is no longer a Futurama vision of what might be, but an important tool for quantitative financial modeling here and now.