I was just put in charge of an application whose goal is to extract a large amount of data (up to 100,000 rows from a table containing 10,000,000 rows). Unfortunately, the extraction is written in Java + Hibernate and the performance is relatively poor. Extracting 100,000 rows using Java + Hibernate takes approximately 1 minute and 30 seconds. The same extraction using Talend takes approximately 30 seconds (three times faster).
Here is a sample of what the code looks like:
Launcher.initStatelessSession();
Launcher.beginStatelessTransaction();
//Creation of the Criteria crit, no join, only a single table is read.
int fetchSize = 1000;
crit.setFetchSize(fetchSize);
crit.setCacheable(false);
crit.setReadOnly(true);
ScrollableResults result = crit.scroll(ScrollMode.FORWARD_ONLY);
// Most of the time is spent from HERE ...
while (result.next()) {
    // Some code but insignificant time compared to the result.next().
    // I replaced this code with continue; and the speed did not really change.
}
// ... to HERE
Any ideas for optimizations that could speed up this query? At the moment, there is no plan to abandon Hibernate for something else.
I don’t know what Talend is, but I suspect it is some kind of database GUI tool?
In that case, a possible reason is that Hibernate has to hydrate the objects, i.e. check that the retrieved object isn’t already in the session, create an instance, and fill in all its properties (possibly including other referenced entities).
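To make the per-row work concrete, here is a minimal plain-Java sketch of those three steps. It is an illustration only, not Hibernate's actual code: `HydrationSketch`, `Row`, and the two-column layout are hypothetical names invented for this example. As far as I know, a stateless session like the one in the question does not keep a first-level cache, so step 1 would largely fall away there, while steps 2 and 3 still run for every row.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified illustration of what an ORM does for each result row.
// This is NOT Hibernate's implementation; all names here are hypothetical.
class HydrationSketch {
    // Identity map: at most one object per primary key within a "session".
    private final Map<Long, Row> sessionCache = new HashMap<>();

    // columns[0] is assumed to be the id, columns[1] the name.
    Row hydrate(Object[] columns) {
        Long id = (Long) columns[0];
        Row existing = sessionCache.get(id); // 1. already in the session?
        if (existing != null) {
            return existing;
        }
        Row row = new Row();                 // 2. create an instance
        row.id = id;                         // 3. fill in its properties
        row.name = (String) columns[1];
        sessionCache.put(id, row);
        return row;
    }

    int cachedCount() {
        return sessionCache.size();
    }

    public static void main(String[] args) {
        HydrationSketch session = new HydrationSketch();
        Row a = session.hydrate(new Object[] {1L, "first"});
        Row b = session.hydrate(new Object[] {1L, "first"});
        System.out.println(a == b); // prints "true": same id, same instance
    }
}

// Hypothetical entity standing in for the mapped class the Criteria reads.
class Row {
    long id;
    String name;
}
```

A tool that copies raw column values straight from the result set skips all of this, which is one plausible source of the gap.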
Use a profiler to find out what is actually going on in greater detail.
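If no profiler is at hand, a crude substitute is to time the loop by hand and split the total into "advancing the cursor" and "processing the row". The sketch below does that over a plain `java.util.Iterator`. `LoopTimer` is a hypothetical helper written for this example, and the iterator merely stands in for the `ScrollableResults` in the question. A real profiler will tell you far more, in particular where inside the fetch the time goes.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Consumer;

// Hypothetical helper: accumulates time spent fetching rows vs. processing them.
class LoopTimer {
    long fetchNanos; // total time spent advancing the cursor
    long bodyNanos;  // total time spent in the per-row body
    long rows;       // number of rows processed

    <T> void run(Iterator<T> cursor, Consumer<T> body) {
        while (true) {
            long t0 = System.nanoTime();
            boolean more = cursor.hasNext();
            T row = more ? cursor.next() : null;
            long t1 = System.nanoTime();
            fetchNanos += t1 - t0;
            if (!more) {
                break;
            }
            body.accept(row);
            bodyNanos += System.nanoTime() - t1;
            rows++;
        }
    }

    public static void main(String[] args) {
        LoopTimer timer = new LoopTimer();
        // Stand-in data; in the real application the cursor would wrap the query result.
        timer.run(Arrays.asList("a", "b", "c").iterator(), row -> { });
        System.out.println("rows=" + timer.rows
                + " fetchNanos=" + timer.fetchNanos
                + " bodyNanos=" + timer.bodyNanos);
    }
}
```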
All of this assumes that you are actually executing the same SQL statement. As mentioned in the comments, depending on your criteria and your mapping, Hibernate might generate very ‘interesting’ select statements.
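One way to check which statement is really sent is to have Hibernate log its SQL. The sketch below just collects the relevant settings in a `java.util.Properties` object. The three property keys are standard Hibernate settings, but `SqlLoggingSettings` is a hypothetical class, and how the properties reach Hibernate depends on how your `Launcher` builds its session factory, which the question does not show.

```java
import java.util.Properties;

// Hypothetical holder for the Hibernate settings that make generated SQL visible.
class SqlLoggingSettings {
    static Properties sqlLoggingProperties() {
        Properties props = new Properties();
        props.setProperty("hibernate.show_sql", "true");         // write each statement to the console
        props.setProperty("hibernate.format_sql", "true");       // pretty-print the statement
        props.setProperty("hibernate.use_sql_comments", "true"); // add comments hinting at the statement's origin
        return props;
    }

    public static void main(String[] args) {
        sqlLoggingProperties().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With the statement in hand, you can run it directly against the database and compare it with what Talend executes.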