I have written a very complex database migration script in Groovy that runs just fine on my workstation but produces “Caught: java.lang.OutOfMemoryError: Java heap space” when run on the server’s JVM. The JVM configuration is fixed (as an intern I have limited resources), so I need to find a way to fix this other than increasing the available memory.
The error strikes when some of the largest tables are accessed: a particularly large, but simple, join (200,000+ rows to 50,000+ rows). Is there another way I can approach such a join that will save me from the error?
Example of query:
target.query("""
    SELECT
        a.*, b.neededColumn
    FROM
        bigTable a JOIN mediumTable b ON
        a.stuff = b.stuff
    ORDER BY stuff DESC
""") { ResultSet rs ->
    ...
}
Can you run the join in SQL on the database server?
If not, you’re probably stuck with iterating through each of your 200,000 results, joining it to the 50,000 rows, and writing out the results as you go (so you aren’t storing more than 1 × 50,000 results in memory at any one time).
Or, if you have access to multiple machines, you could divide your 200,000 items into blocks and do one block per machine?
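The block idea can be sketched roughly as follows. This is only an illustration under assumptions that are not in the question: `bigTable` is assumed to have a numeric, indexed key column called `id`, and the helper names `keyRanges` and `joinInBlocks` are made up. Each block is a key range, so a block could equally be handed to a different machine. Note that the overall `ORDER BY` from the original query is lost when the work is split this way, and if the keys are sparse the blocks will hold fewer rows than `blockSize`.

```groovy
import groovy.sql.Sql

// Split the inclusive key range [minId, maxId] into consecutive
// half-open ranges [from, to) covering at most blockSize keys each.
List<List<Long>> keyRanges(long minId, long maxId, long blockSize) {
    def ranges = []
    for (long from = minId; from <= maxId; from += blockSize) {
        ranges << [from, Math.min(from + blockSize, maxId + 1)]
    }
    ranges
}

// Run the join once per key range and pass every row to handleRow,
// so only the rows of the current block are in flight at any time.
// `id` is a hypothetical key column on bigTable.
void joinInBlocks(Sql db, long blockSize, Closure handleRow) {
    def bounds = db.firstRow('SELECT MIN(id) AS lo, MAX(id) AS hi FROM bigTable')
    if (bounds.lo == null) return          // empty table, nothing to do
    keyRanges(bounds.lo as long, bounds.hi as long, blockSize).each { range ->
        db.eachRow('''
            SELECT a.*, b.neededColumn
            FROM bigTable a JOIN mediumTable b ON a.stuff = b.stuff
            WHERE a.id >= ? AND a.id < ?
        ''', range) { row -> handleRow(row) }
    }
}

// Usage, assuming `target` is the open groovy.sql.Sql from the question:
// joinInBlocks(target, 5000) { row -> /* write the row out */ }
```

Using key ranges rather than row counts keeps the blocks correct even when one `bigTable` row matches several `mediumTable` rows.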
Edit
Taking your example code, you should be able to write each row out to the file output.csv as it is read from the ResultSet, so the full result never has to be held in memory.
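A minimal sketch of such a row-by-row export, assuming `target` is a `groovy.sql.Sql` instance as the question's code suggests. The helper names `csvField` and `exportQuery` are made up for illustration, and the CSV quoting is deliberately simple.

```groovy
import groovy.sql.Sql
import java.sql.ResultSet

// Format one value as a CSV field: null becomes an empty field, and values
// containing a comma, quote, or line break are quoted with quotes doubled.
String csvField(Object value) {
    if (value == null) return ''
    String s = value.toString()
    boolean needsQuotes = [',', '"', '\n', '\r'].any { s.contains(it) }
    needsQuotes ? '"' + s.replace('"', '""') + '"' : s
}

// Run the query and write each row to the writer as soon as it is read,
// instead of collecting the rows in a list first.
void exportQuery(Sql db, String query, Writer out) {
    db.query(query) { ResultSet rs ->
        int columns = rs.metaData.columnCount
        while (rs.next()) {
            out.write((1..columns).collect { csvField(rs.getObject(it)) }.join(','))
            out.write('\n')
        }
    }
}

// Usage, assuming `target` is the open groovy.sql.Sql from the question:
// new File('output.csv').withWriter { w ->
//     exportQuery(target, '''
//         SELECT a.*, b.neededColumn
//         FROM bigTable a JOIN mediumTable b ON a.stuff = b.stuff
//         ORDER BY stuff DESC''', w)
// }
```

This only helps if the JDBC driver does not buffer the whole result set itself. Setting a fetch size, for example with `target.withStatement { stmt -> stmt.fetchSize = 500 }` before the query, is a hint that some drivers honour only under certain connection settings, so it is worth checking the driver's documentation.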