I hope this does not get closed, because it is about algorithms that I have not been able to figure out (it's also pretty long, because I'm so confused about how it's being done). Many years back I worked at a mutual fund, and we used different tools to select and optimize portfolios as well as to hedge existing ones. We would take these results, make our own modifications, and then sell them to clients. After my company downsized, I decided I wanted to give it a try myself (to create the software and include my customizations), but I have no clue how the combinations are actually generated by the software.
After 6 months of trying, I'm accepting that my approach is impossible. I was trying to use combination algorithms like the ones in Knuth's book, as well as bit combinations, to find every possible portfolio (I limited it to 30 stocks) on the NYSE (5,000+ stocks). But according to everyone I have spoken to, this would take billions of billions of years to produce just one day's results (on a GPU, I stopped it after 2 days of straight processing).
So what am I missing? We would enter our risk tolerance and our view of the market (stock market growth expectations, inflation expectations, fed funds expectations, etc.) and it would give us the ideal portfolio (in theory) within a few seconds or minutes. With thousands of stocks and quadrillions of possible combinations of stock weights, how are they able to calculate results so quickly (or even at all)?
As the admin of the system, I know we downloaded a file every day. It was less than 100 MB and was loaded into an MSSQL database, so it was probably just market data; it's not like we had every possibility. (Using my approach above, I would get a 5 GB file within a minute of running my version of Knuth's combination algorithm.) The applications also worked offline, so the work must have been done locally on the desktop/laptop CPU, not on a massive supercomputer somewhere. A run took a minute or two; 15 minutes was the longest, for a global fund that includes every stock in the world. It's so confusing because the work required the correlation of the entire fund (I don't think they were just sending us the top stocks they had pre-calculated, because everyone got different results).
So if I wanted a 30-stock fund that gave me 2% returns, had a negative correlation with the market, and was 60% hedged, how could the software generate that portfolio out of billions of possibilities so quickly? Note that I'm not asking about the math or the finance part. I'm asking how it was able to pick 30 stocks from the entire market that gave 2% returns, when in order to do that it would seem to need to know the returns of every 30-stock portfolio. (That alone would make it run for billions of years, right? The other restrictions make it more complex.)
So how is this being done programmatically? I'm starting to believe they are not using Knuth's combination algorithm to generate every possibility, yet their results don't seem randomly selected, and selecting the stocks individually seems to miss the correlation part. How can so many investment software packages do things like this?
Such algorithms almost certainly don't generate every possibility; as you rightly observe, that would be impractical.
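To put a number on "impractical", here is a quick back-of-the-envelope check using the figures from the question (30 stocks chosen from 5,000). The evaluation rate of one billion portfolios per second is purely an assumption for illustration, and stock weights are ignored entirely, which would only make the count larger.

```python
import math

N_STOCKS = 5000        # universe size quoted in the question
PORTFOLIO_SIZE = 30    # portfolio size quoted in the question
EVALS_PER_SECOND = 1e9 # assumed throughput, for illustration only

# Number of distinct 30-stock subsets, before even considering weights.
n_portfolios = math.comb(N_STOCKS, PORTFOLIO_SIZE)

# Time to score every subset once at the assumed rate.
seconds = n_portfolios / EVALS_PER_SECOND
years = seconds / (365.25 * 24 * 3600)

print(f"portfolios to enumerate: about 10^{len(str(n_portfolios)) - 1}")
print(f"years at the assumed rate: about {years:.1e}")
```

The subset count alone is on the order of 10^78, so no amount of hardware makes brute-force enumeration feasible; the software must be searching or optimizing rather than listing.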
Portfolio selection is however very easy to do with other techniques that will give you a very good answer. The two most likely are:
Personally, I'd probably suggest the genetic algorithm approach. Although it's not as mathematically pure, it will give you good answers and should be able to handle any constraints you want to throw at it quite easily (e.g. a maximum number of stocks in a portfolio).
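To make the genetic algorithm idea concrete, here is a minimal sketch. It is not how any particular commercial package works; everything in it is an assumption for illustration: the universe is synthetic random data, risk is modelled with a simple one-factor (market beta) model, portfolios are equal-weighted, and the names `make_universe`, `fitness`, `crossover`, `mutate`, `evolve`, `MARKET_VAR` and `risk_aversion` are all hypothetical. The point is only that a population of candidate portfolios is scored as a whole (so correlation through the shared market factor counts) and improved over generations, instead of enumerating every combination.

```python
import random

MARKET_VAR = 0.04  # assumed variance of the single market factor

def make_universe(n_stocks, rng):
    """Synthetic data per stock: (expected return, market beta, idiosyncratic variance)."""
    return [(rng.uniform(0.0, 0.15), rng.uniform(-0.5, 2.0), rng.uniform(0.01, 0.10))
            for _ in range(n_stocks)]

def fitness(portfolio, universe, risk_aversion=3.0):
    """Score an equal-weight portfolio: expected return minus a variance penalty."""
    w = 1.0 / len(portfolio)
    exp_ret = w * sum(universe[i][0] for i in portfolio)
    beta = w * sum(universe[i][1] for i in portfolio)
    idio = w * w * sum(universe[i][2] for i in portfolio)
    variance = beta * beta * MARKET_VAR + idio  # one-factor risk model
    return exp_ret - risk_aversion * variance

def crossover(a, b, k, rng):
    """Child holds k distinct stocks drawn from the union of both parents."""
    pool = sorted(set(a) | set(b))
    return tuple(sorted(rng.sample(pool, k)))

def mutate(portfolio, n_stocks, rng, rate=0.2):
    """With probability `rate`, swap one holding for a stock not already held."""
    stocks = list(portfolio)
    if rng.random() < rate:
        held = set(stocks)
        new = rng.randrange(n_stocks)
        while new in held:
            new = rng.randrange(n_stocks)
        stocks[rng.randrange(len(stocks))] = new
    return tuple(sorted(stocks))

def evolve(universe, k=30, pop_size=100, generations=200, seed=0):
    """Return (best portfolio found, best score at the start of each generation)."""
    rng = random.Random(seed)
    n = len(universe)
    pop = [tuple(sorted(rng.sample(range(n), k))) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, universe), reverse=True)
        history.append(fitness(pop[0], universe))
        survivors = pop[:pop_size // 2]  # elitism: the top half is kept unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            children.append(mutate(crossover(mum, dad, k, rng), n, rng))
        pop = survivors + children
    best = max(pop, key=lambda p: fitness(p, universe))
    return best, history

if __name__ == "__main__":
    universe = make_universe(5000, random.Random(42))
    best, history = evolve(universe)
    print("first-generation best score:", round(history[0], 4))
    print("final best score:", round(fitness(best, universe), 4))
    print("stocks held:", best)
```

With these defaults the search scores about 100 portfolios per generation for 200 generations, a tiny number of evaluations compared with enumerating every subset. The result is a good portfolio rather than a provably optimal one. Extra constraints (a target return, a hedge ratio, sector limits) could be added as penalty terms inside `fitness`, which is what makes this approach flexible.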