I’m reading up on non-blocking I/O because I’m using Akka and Play, where, as far as I can tell, blocking should be avoided whenever possible. However, I can’t see how to make it fit my use case:
- Get file over network (here alternatives using nio exist, but right now I’m using URL.openStream)
- Decrypt file (PGP) using BouncyCastle (here I’m limited to InputStream)
- Unzip file using standard Java GZIP (limited to InputStream)
- Read each line in the file, which is a position-based flat file, and convert it to case classes (here I have no constraints on the method for reading; right now I use scalax.io.Resource)
- Persist using Slick/JDBC (Not sure if JDBC is blocking or not)
It’s working right now basically using InputStreams all the way. However, in the interest of learning and improving my understanding, I’m investigating if I could do this with nonblocking IO.
I’d basically like to stream the file through a pipeline where I apply each step above and finally persist the data without blocking.
If code is required I can easily provide it, but I’m looking for a solution on a general level: what do I do when I’m dependent on libraries using java.io?
I hope this helps with some of your points:
1/2/3/4) Akka can work well with libraries that use java.io.InputStream and java.io.OutputStream. See this page, specifically this section: http://doc.akka.io/docs/akka/snapshot/scala/io.html
1) You say you get a file over the network. I’m guessing via HTTP? You could look into an asynchronous HTTP library. There are many fairly mature async HTTP libraries out there. I like using Spray Client in Scala as it is built on top of Akka, so it plays well in an Akka environment. It supports GZIP, but not PGP.
4) Another option: is the file small enough to hold in memory? If so, once it has been downloaded you need not worry about being asynchronous for the remaining steps, because they no longer perform any IO. Instead of blocking whilst waiting for IO, they keep the CPU busy, as memory is fast.
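To make the in-memory option concrete, here is a minimal sketch that unzips an already-downloaded byte array and parses each line into a case class. The record layout (a 10-character name followed by a 3-digit age), the Person class and the gzip helper are hypothetical and only there for illustration. The PGP step is left out; it would wrap the stream in the same way.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.charset.StandardCharsets.UTF_8
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.io.Source

object InMemoryExample {
  // Hypothetical layout: columns 0-9 hold a name, columns 10-12 an age.
  final case class Person(name: String, age: Int)

  def parseLine(line: String): Person =
    Person(line.substring(0, 10).trim, line.substring(10, 13).trim.toInt)

  // Helper that only exists to produce sample data for this sketch.
  def gzip(text: String): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new GZIPOutputStream(buffer)
    try out.write(text.getBytes(UTF_8)) finally out.close()
    buffer.toByteArray
  }

  // Everything here works on memory, so no thread waits on the network or disk.
  def parseAll(compressed: Array[Byte]): List[Person] = {
    val in = new GZIPInputStream(new ByteArrayInputStream(compressed))
    try Source.fromInputStream(in, "UTF-8").getLines().map(parseLine).toList
    finally in.close()
  }
}
```

The libraries still see a plain InputStream, but it is backed by a byte array, so the only remaining cost is CPU time.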
5) JDBC is blocking. You call a method with the SQL query as the argument, and the return type is a result set with the data. The method must block whilst performing the IO to be able to return this data.
There are some Java async database drivers, but all the ones I have seen seem unmaintained, so I haven’t used them.
Fear not. Read this section of the akka docs for how to deal with blocking libraries in an akka environment:
http://doc.akka.io/docs/akka/snapshot/general/actor-systems.html#Blocking_Needs_Careful_Management
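As a rough illustration of that advice, one common approach is to run the whole blocking InputStream pipeline on its own thread pool, so the threads that serve actors and requests are never tied up. The sketch below uses only the standard library. The BlockingPipeline object, the processLines name and the pool size of 4 are assumptions made for the example, and in a real Akka application you would more likely configure a dedicated dispatcher instead.

```scala
import java.io.{BufferedReader, InputStream, InputStreamReader}
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

object BlockingPipeline {
  // Dedicated pool for blocking work; the size of 4 is an arbitrary assumption.
  private val pool = Executors.newFixedThreadPool(4)
  val blockingEc: ExecutionContext = ExecutionContext.fromExecutorService(pool)

  // `open` is expected to return the fully wrapped stream
  // (network -> PGP -> GZIP); `handle` converts or persists one line.
  // All of it runs on blockingEc, so callers only ever see a Future.
  def processLines[A](open: () => InputStream)(handle: String => A): Future[Vector[A]] =
    Future {
      val reader = new BufferedReader(new InputStreamReader(open(), "UTF-8"))
      try Iterator.continually(reader.readLine()).takeWhile(_ != null).map(handle).toVector
      finally reader.close()
    }(blockingEc)

  // Call on application shutdown so the pool's threads do not keep the JVM alive.
  def shutdown(): Unit = pool.shutdown()
}
```

The caller gets a Future back immediately and can map over it or pipe it to an actor. The blocking still happens, but it is confined to a pool sized for it rather than starving the default dispatcher.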