For a thick-client project I’m working on, I have to remotely connect to a database (IBM i-series) and perform a number of SQL-related tasks:
1. Download/Update a set of local/offline 'control' data. This data may have changed between runs unnoticed.
2. On command, download data from multiple (15-20) tables and store separately into a single Java object. The names of the tables are known, but the schema name changes between runs and can change inter-run (as far as I know, PreparedStatements do not allow one to dynamically insert the schema). I had considered using joins/unions/etc. to perform all of these queries as one, but the project requires me to have in-memory separations between table data (instead of one big joined lump).
3. Perform between 2 and 100+ repetitions of (2).
The last factor is that this needs to be run on high-latency (potentially dial-up) network connections using Java 1.5 on the oldest computers possible.
Currently I run 15-20 dynamically constructed PreparedStatements, but I know this to be rather inefficient (I measured, so as to avoid premature optimization à la Knuth).
What would be the most efficient and error-tolerant method of performing these tasks?
My thoughts:
- Regarding (1), I really have no idea other than checking the entire table against the new table, at which point I feel I might as well just download the new (potentially and likely unchanged) table and replace the old one, but this takes more time.
- For (2): Ideally I’d be able to construct something similar to an array of SELECT statements, send them all at once, and have the database return one ResultSet per internal query. From what I understand, however, neither Statement nor PreparedStatement supports returning multiple ResultSet objects.
- Lastly, the best way I can think of doing (3) is to batch a number of (2) operations.
There is nothing special about having moving requirements, but the single most important thing when talking to most databases is to have a connection pool in your Java application and to use it properly.
This also applies here. The IBM i DB2/400 database is quite fast, and the database driver available in the jt400 project (type 4, no native code) is quite good, so you can pull over quite a bit of data in a short while simply by generating SQL on the fly.
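As a rough illustration of generating SQL on the fly with a schema that changes between runs, here is a minimal sketch. The class and method names (`QualifiedSql`, `select`, `selectAll`) and the identifier whitelist are my own assumptions, not part of jt400. Since the schema cannot be bound as a statement parameter, the sketch validates it before concatenating it into the statement text. The `.` separator assumes SQL naming.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: builds schema-qualified SELECT statements as plain strings.
final class QualifiedSql {
    // Conservative whitelist for plain (undelimited) identifiers.
    // This pattern is an assumption; adjust it to your own naming rules.
    private static final Pattern IDENTIFIER =
            Pattern.compile("[A-Za-z][A-Za-z0-9_#@$]*");

    // Rejects anything that is not a plain identifier, so that the
    // concatenation below cannot be abused for SQL injection.
    static String checked(String identifier) {
        if (identifier == null || !IDENTIFIER.matcher(identifier).matches()) {
            throw new IllegalArgumentException("Not a plain SQL identifier: " + identifier);
        }
        return identifier;
    }

    // Builds one SELECT for a single table in the given schema.
    static String select(String schema, String table) {
        return "SELECT * FROM " + checked(schema) + "." + checked(table);
    }

    // Builds one SELECT per table, preserving the order of the table names.
    static List<String> selectAll(String schema, String[] tables) {
        List<String> statements = new ArrayList<String>();
        for (int i = 0; i < tables.length; i++) {
            statements.add(select(schema, tables[i]));
        }
        return statements;
    }
}
```

Each generated string can then be run with an ordinary `Statement` or `PreparedStatement` on a pooled connection, keeping one `ResultSet` per table as the project requires.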
Note that if you only have a single schema, you can specify in the connection which one you need, and can then use non-qualified table names in your SQL statements. Read the JDBC properties in the InfoCenter very carefully; it is a bit tricky to get right. If you need multiple schemas, the “naming=system” property allows for library lists, i.e. a list of schemas in which to look for the tables, which can be very useful when done correctly. The IBM i folks can help you here.
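To make the connection settings a little more concrete, here is a small sketch that assembles a jt400 JDBC URL using the `naming` and `libraries` properties. The helper class `Jt400Url`, the host name and the library names are hypothetical; check the exact semantics of the properties (in particular how `libraries` interacts with the default schema) against the JDBC properties documentation before relying on them.

```java
// Hypothetical helper: assembles a jt400 JDBC URL from its parts.
final class Jt400Url {
    // host:         name or address of the IBM i system
    // libraries:    schemas (libraries) to put on the library list, in search order
    // systemNaming: true for naming=system (library list lookup), false for naming=sql
    static String build(String host, String[] libraries, boolean systemNaming) {
        StringBuilder url = new StringBuilder("jdbc:as400://").append(host);
        url.append(";naming=").append(systemNaming ? "system" : "sql");
        if (libraries.length > 0) {
            url.append(";libraries=");
            for (int i = 0; i < libraries.length; i++) {
                if (i > 0) {
                    url.append(',');
                }
                url.append(libraries[i]);
            }
        }
        return url.toString();
    }
}
```

With the placeholder values `Jt400Url.build("myhost", new String[] { "LIBA", "LIBB" }, true)` this yields `jdbc:as400://myhost;naming=system;libraries=LIBA,LIBB`, which can be passed to `DriverManager.getConnection(url, user, password)` once the jt400 driver (`com.ibm.as400.access.AS400JDBCDriver`) is on the classpath. Because the schema lives in the connection rather than in the statement text, the SQL itself can stay unqualified.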
That said, if the connection is the limiting factor, you might have a very strong case for running the “create object from tables” Java code directly on the IBM i. You should prepare now to be able to measure the traffic to the database, either with network monitoring tooling, using p6spy, or simply by going through a proxy (perhaps even a throttling one).