I’m trying to implement a pattern I read about on Don Syme’s blog, which suggests that there are opportunities for massive performance improvements from leveraging asynchronous I/O. I am currently trying to take a piece of code that “works” one way, using Array.Parallel.map, and see if I can somehow achieve the same result using Async.Parallel, but I really don’t understand Async.Parallel, and cannot get anything to work.
I have a piece of code (simplified below to illustrate the point) that successfully retrieves an array of data for one cusip. (A price series, for example)
let getStockData cusip =
    let D = DataProvider()
    let arr = D.GetPriceSeries(cusip)
    arr

let data = Array.Parallel.map (fun x -> getStockData x) stockCusips
So this approach constructs an array of arrays by making a connection over the internet to my data vendor for each stock (which could be as many as 3000), giving me one price series per stock. I admittedly don’t understand what goes on underneath Array.Parallel.map, but am wondering if this is a scenario where there are resources wasted under the hood, and whether it could actually be faster using asynchronous I/O. So to test this out, I have attempted to write this function using asyncs, and I think that the function below follows the pattern in Don Syme’s article using the URLs, but it won’t compile with “let!”.
let getStockDataAsync cusip =
    async { let D = DataProvider()
            let! arr = D.GetData(cusip)
            return arr
          }
The error I get is:
This expression was expected to have type Async<'a> but here has type obj
It compiles fine with “let” instead of “let!”, but I had thought the whole point was that you need the exclamation point in order for the command to run without blocking a thread.
So the first question really is, what’s wrong with my syntax above, in getStockDataAsync? And then at a higher level, can anyone offer some additional insight about asynchronous I/O and whether the scenario I have presented would benefit from it, making it potentially much, much faster than Array.Parallel.map? Thanks so much.
F# asynchronous workflows allow you to implement asynchronous computations; however, F# makes a distinction between ordinary computations and asynchronous computations. This difference is tracked by the type system. For example, a method that downloads a web page synchronously has the type string -> string (taking a URL and returning HTML), but a method that does the same thing asynchronously has the type string -> Async<string>. In the async block, you can use let! to call asynchronous operations, but all other (standard synchronous) methods have to be called using let. Now, the problem with your example is that the GetData operation is an ordinary synchronous method, so you cannot invoke it with let!.

In the typical F# scenario, if you want to make the GetData member asynchronous, you’ll need to implement it using an asynchronous workflow, so you’ll also need to wrap it in an async block. At some point, you will reach a location where you really need to run some primitive operation asynchronously (for example, downloading data from a web site). F# provides several primitive asynchronous operations that you can call from an async block using let!, such as AsyncGetResponse (which is an asynchronous version of the GetResponse method). So, in your GetData method, you would for example call AsyncGetResponse with let! inside an async block.

The summary is that you need to identify some primitive asynchronous operations (such as waiting for the web server or for the file system), use the primitive asynchronous operations at that point, and wrap all the code that uses these operations in async blocks. If there are no primitive operations that could be run asynchronously, then your code is CPU-bound and you can just use Array.Parallel.map.

I hope this helps you understand how F# asynchronous workflows work. For more information, you can for example take a look at Don Syme’s blog post, the series about asynchronous programming by Robert Pickering, or my F# web cast.