The Archive Base Latest Questions

Editorial Team
Asked: June 5, 2026

So I have a bit of a performance problem. I have made a java

So I have a bit of a performance problem. I have made a Java program that constructs a database. The problem is when loading in the data. I am loading 5,000 files into a SQL database. When the program starts off, it can process about 10% of the files in 10 minutes; however, it gets much slower as it progresses. Currently, at 28%, it is going to finish in 16 hours at its current rate. However, that rate is slowing down considerably.

My question is: why does the program get progressively slower as it runs, and how can I fix that?

EDIT: I have two versions. One is threaded (capped at 5 threads) and one is not. The difference between the two is negligible. I can post the code again if anyone likes, but I took it out because I am now fairly certain that the bottleneck is MySQL (also re-tagged appropriately). I went ahead and used batch inserts. This did cause an initial increase in speed, but once again, after processing about 30% of the data, it drops off quickly.

So SQL Points

  1. The engine for all 64 tables is InnoDB, version 10.
  2. The tables have about 300k rows at this point (~30% of the data).
  3. All tables have one “joint” (composite) primary key: an id and a date.
  4. Looking at MySQL Workbench, I see that there is one query per thread (5 queries).
  5. I am not sure of the unit of time (I am just reading from MySQL Administrator), but the queries that check whether a file is already inserted are taking ~300. (This query should be fast, as it is a SELECT MyIndex from MyTable Limit 1 to 1 where Date = date.) Because I have been starting and stopping the program, I built in this check to see whether the file was already inserted. That way I can restart it after each change and see what improvement there is, if any, without starting the process over.
  6. I am fairly certain that the degradation of performance is related to the tables’ sizes. (I can stop and start the program now and the process remains slow. It is only when the tables are small that the process runs at an acceptable speed.)
  7. Please, please ask and I will post what ever information you need.

DONE! Well, I just let it run for the 4 days it needed. Thank you all for the help.

Cheers,

–Orlan

1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 6:39 pm

    Q1: Why does the program get progressively slower?

    In your problem space, you have 2 systems interacting: a producer that reads from the file system and produces data, and a consumer that transforms that data into records and stores them in the db. Your code currently couples these two processes tightly, so the whole system runs at the speed of the slower of the two.

    In your program you have a fixed arrival rate (1/sec, the wait applied when you have more than 10 threads running). If the tables being filled have indexes, inserts will take longer as the tables grow. That means that while your arrival rate is fixed at 1/sec, your exit rate is continuously decreasing. Therefore, you will be creating more and more threads that share the same CPU/IO resources, and getting fewer things done per unit of time. Creating threads is also a very expensive operation.
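To make the arrival-rate argument concrete, here is a minimal simulation sketch. The 1 task/sec arrival rate comes from the answer; everything else is an assumption for illustration: a single shared resource, and an insert cost that grows linearly with the number of rows already written (the `baseSeconds` and `growthPerTask` values are hypothetical, not measured).

```java
import java.util.ArrayList;
import java.util.List;

final class BacklogSketch {
    /**
     * Returns the number of unfinished tasks after each simulated second.
     * One task arrives per second; the n-th insert needs
     * baseSeconds + growthPerTask * n seconds of work on one shared resource.
     */
    static List<Integer> backlogPerSecond(int seconds, double baseSeconds, double growthPerTask) {
        List<Integer> backlog = new ArrayList<>();
        int arrived = 0;
        int completed = 0;
        double workLeft = 0.0;   // work remaining on the task currently being served
        boolean serving = false;
        for (int t = 0; t < seconds; t++) {
            arrived++;           // fixed arrival rate: one task per second
            double budget = 1.0; // one second of capacity to spend
            while (budget > 0 && completed < arrived) {
                if (!serving) {
                    // Cost of the next insert grows with the rows already written.
                    workLeft = baseSeconds + growthPerTask * completed;
                    serving = true;
                }
                double spent = Math.min(budget, workLeft);
                workLeft -= spent;
                budget -= spent;
                if (workLeft <= 1e-12) {
                    completed++;
                    serving = false;
                }
            }
            backlog.add(arrived - completed);
        }
        return backlog;
    }

    public static void main(String[] args) {
        // Hypothetical costs: 0.5 s per insert at first, plus 0.01 s per completed insert.
        List<Integer> b = backlogPerSecond(600, 0.5, 0.01);
        System.out.println("backlog after 60 s:  " + b.get(59));
        System.out.println("backlog after 300 s: " + b.get(299));
        System.out.println("backlog after 600 s: " + b.get(599));
    }
}
```

In this model the backlog stays at zero while an insert costs less than one second, and grows without bound once the cost passes the arrival interval, which matches the "fast at first, then slower and slower" behaviour described in the question.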

    Q2: Could it have to do with how I am constructing the queries from Strings?

    Only partially. Your string manipulation is a fixed cost in the system. It increases the cost of servicing one request. But string operations are CPU-bound and your problem is I/O-bound, meaning that improving the string handling (which you should do anyway) will only marginally improve the performance of the system. (See Amdahl’s Law.)
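As a worked illustration of Amdahl's Law: if a fraction p of the runtime is made s times faster, the overall speedup is 1 / ((1 - p) + p / s). The sketch below plugs in an assumed split of 5% string handling and 95% database I/O; these fractions are hypothetical and should be replaced with profiler measurements.

```java
final class AmdahlSketch {
    /** Overall speedup when a fraction p of the runtime is made s times faster. */
    static double speedup(double p, double s) {
        return 1.0 / ((1.0 - p) + p / s);
    }

    public static void main(String[] args) {
        // Assumed split (not measured): 5% string building, 95% database I/O.
        System.out.printf("string handling 10x faster: %.3f%n", speedup(0.05, 10.0));
        System.out.printf("inserts 2x faster:          %.3f%n", speedup(0.95, 2.0));
    }
}
```

Under these assumed fractions, making the string code ten times faster yields an overall speedup of only about 1.05, while merely doubling insert speed yields about 1.90. That is why the I/O side deserves attention first.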

    Q3: How to fix that (performance issue)?

    • Separate the file reader process from the db insert process. See the Consumer-Producer pattern. See also CompletionService for an implementation built into the JDK:

    (FileReaderProducer) --> queue --> (DBBulkInsertConsumer)

    • Don’t create new Threads. Use the facilities provided by the java.util.concurrent package, like the ExecutorService or the CompletionService mentioned above. For a “bare” thread pool, use the Executors factory.

    • For this specific problem, having 2 separate thread pools (one for the consumer, one for the producer) will allow you to tune your system for best performance. File reading improves with parallelization (up to your I/O bound), but db inserts do not (I/O + indexes + relational consistency checks), so you might need to limit the number of file-reading threads (3-5) to match the insertion rate (2-3). You can monitor the queue size to evaluate your system performance.

    • Use JDBC bulk inserts: http://viralpatel.net/blogs/batch-insert-in-java-jdbc/
    • Use StringBuilder instead of String concatenation. Strings in Java are immutable. That means that every time you do: myString += ","; you are creating a new String and making the old String eligible for garbage collection. In turn, this increases the garbage collection overhead.
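The suggestions in this list can be combined into one sketch: a reader pool feeding a bounded queue, a writer pool draining it in batches, StringBuilder for row construction, and JDBC batching via `addBatch`/`executeBatch`. This is an illustration under assumptions, not the asker's code: the class, the `BatchSink` interface, `toRow`, and the table and column names in the SQL are all hypothetical, and error handling is deliberately left out.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import javax.sql.DataSource;

final class LoaderPipelineSketch {
    // Unique sentinel, compared by identity, that tells a writer to stop.
    private static final String POISON = new String("POISON");

    /** Hypothetical sink that receives one batch of rows. */
    interface BatchSink { void write(List<String> rows) throws Exception; }

    /** Placeholder for real file parsing; builds one row with StringBuilder. */
    static String toRow(String fileName) {
        StringBuilder sb = new StringBuilder();
        sb.append(fileName).append(',').append(fileName.length());
        return sb.toString();
    }

    static void run(List<String> files, int readers, int writers, int batchSize, BatchSink sink)
            throws InterruptedException {
        // Bounded queue: readers block when writers fall behind.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1000);
        ExecutorService readerPool = Executors.newFixedThreadPool(readers);
        ExecutorService writerPool = Executors.newFixedThreadPool(writers);

        for (int i = 0; i < writers; i++) {
            writerPool.submit(() -> {
                List<String> batch = new ArrayList<>();
                while (true) {
                    String row = queue.take();
                    if (row == POISON) break;
                    batch.add(row);
                    if (batch.size() >= batchSize) {
                        sink.write(batch);
                        batch = new ArrayList<>();
                    }
                }
                if (!batch.isEmpty()) sink.write(batch); // flush the remainder
                return null; // Callable, so checked exceptions are allowed
            });
        }
        for (String file : files) {
            readerPool.submit(() -> { queue.put(toRow(file)); return null; });
        }
        readerPool.shutdown();
        readerPool.awaitTermination(1, TimeUnit.HOURS);
        for (int i = 0; i < writers; i++) queue.put(POISON); // one sentinel per writer
        writerPool.shutdown();
        writerPool.awaitTermination(1, TimeUnit.HOURS);
    }

    /** JDBC sink; table and column names are hypothetical. One connection per batch. */
    static BatchSink jdbcSink(DataSource dataSource) {
        return rows -> {
            try (Connection conn = dataSource.getConnection();
                 PreparedStatement ps =
                         conn.prepareStatement("INSERT INTO my_table (payload) VALUES (?)")) {
                for (String row : rows) {
                    ps.setString(1, row);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
        };
    }
}
```

A call such as `LoaderPipelineSketch.run(files, 4, 2, 500, LoaderPipelineSketch.jdbcSink(dataSource))` would use 4 reader threads and 2 writer threads; those numbers are starting points to tune while watching the queue size. In a real loader a failing writer would also need to be handled, since a dead writer can leave readers blocked on a full queue.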