The Archive Base


Editorial Team
Asked: May 28, 2026

I am facing a certain behavior using Amazon EC2 and Java that it’s being

I am seeing behavior with Amazon EC2 and Java that I am finding hard to understand. I have code that uses iText to split a single multi-page PDF file into many files (one file per page). I have about 1 million pages to extract (around 2,500 source files), so I am running tests on EC2 to determine which setup will work best for the job.

I wrote a small application (link below) that can either process each source file sequentially, without starting any worker threads, or perform the same task with Java threads via Executors.

On my local MacBook Pro the threaded version runs about 30 to 40% faster than the sequential one, but on every EC2 instance I tried, the threaded version performed much worse than the sequential run.

I tried a small instance, a large instance, and a high-CPU extra large instance. What I am trying to understand is what could cause such bad results for the threaded version: is it something in my code, I/O on EC2, or simply that threads are a bad choice for this specific task? Any clue is welcome.

The relevant code is here: https://gist.github.com/1641643 (sorry for the “flag-oriented programming”; it was just easier to switch between the tests that way). I tried different values for Executors.newFixedThreadPool (2, 4, 8, etc.) without any significant change in the results.
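For readers without access to the gist, the comparison described above can be sketched roughly as follows. This is not the code from the gist: `PoolTimingSketch`, `processFile`, `runSequential`, and `runPooled` are hypothetical names, and `processFile` is a CPU-only placeholder standing in for the iText split of one source file. Printing `availableProcessors()` is a quick way to check how many cores the JVM actually sees on each instance type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolTimingSketch {

    // Hypothetical placeholder for "split one source PDF"; swap in the real work.
    static int processFile(int fileId) {
        int acc = fileId;
        for (int i = 0; i < 200_000; i++) {
            acc = acc * 31 + i; // int overflow wraps, which is fine for a dummy workload
        }
        return acc;
    }

    // Runs every task on the calling thread and returns the sum of the results.
    static long runSequential(int fileCount) {
        long sum = 0;
        for (int id = 0; id < fileCount; id++) {
            sum += processFile(id);
        }
        return sum;
    }

    // Runs the same tasks on a fixed-size pool and returns the same sum.
    static long runPooled(int fileCount, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int id = 0; id < fileCount; id++) {
                final int fileId = id;
                Callable<Integer> task = () -> processFile(fileId);
                futures.add(pool.submit(task));
            }
            long sum = 0;
            for (Future<Integer> f : futures) {
                sum += f.get(); // blocks until that task has finished
            }
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("cores visible to the JVM: "
                + Runtime.getRuntime().availableProcessors());
        int files = 200; // arbitrary count for the sketch
        long start = System.nanoTime();
        runSequential(files);
        System.out.printf("sequential: %d ms%n", (System.nanoTime() - start) / 1_000_000);
        for (int threads : new int[] {1, 2, 4, 8}) {
            start = System.nanoTime();
            runPooled(files, threads);
            System.out.printf("pool of %d: %d ms%n", threads,
                    (System.nanoTime() - start) / 1_000_000);
        }
    }
}
```

Because the placeholder does no disk access, a harness like this only shows how the pool scales with CPU cores. If it scales well on an instance where the real job does not, that points toward I/O rather than threading overhead.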


1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 5:44 am

    Wild guess, but if all the threads read from and write to a single hard disk, the disk is forced to constantly seek between read and write locations. In the single-threaded approach, by contrast, the thread can read the whole input file at once and write the result at once.
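One way to test this guess is to keep disk access sequential while still using several threads for the CPU work. The sketch below is an assumption-laden illustration, not the asker's code: `SerializedIoSketch`, `transform`, and `processAll` are hypothetical names, and `transform` is a placeholder for the in-memory part of the split. The calling thread reads each input file whole, a fixed pool does the CPU work, and a single-threaded executor performs all writes so that only one write hits the disk at a time.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SerializedIoSketch {

    // Hypothetical placeholder for the CPU-bound step; here it just reverses the bytes.
    static byte[] transform(byte[] input) {
        byte[] out = new byte[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = input[input.length - 1 - i];
        }
        return out;
    }

    static byte[] readFully(Path path) {
        try {
            return Files.readAllBytes(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void writeFully(Path path, byte[] bytes) {
        try {
            Files.write(path, bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the number of files processed.
    static int processAll(List<Path> inputs, Path outputDir, int workers) {
        ExecutorService cpuPool = Executors.newFixedThreadPool(workers);
        ExecutorService writer = Executors.newSingleThreadExecutor();
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (Path input : inputs) {
                // Reads happen one at a time on the calling thread.
                byte[] raw = readFully(input);
                Path target = outputDir.resolve(input.getFileName().toString() + ".out");
                pending.add(CompletableFuture
                        .supplyAsync(() -> transform(raw), cpuPool)                    // parallel CPU work
                        .thenAcceptAsync(bytes -> writeFully(target, bytes), writer)); // serialized writes
            }
            for (CompletableFuture<Void> f : pending) {
                f.join(); // rethrows any failure as an unchecked CompletionException
            }
            return pending.size();
        } finally {
            cpuPool.shutdown();
            writer.shutdown();
        }
    }
}
```

If this arrangement beats the plain thread pool on EC2, disk contention is a likely culprit; if it does not, the bottleneck is probably elsewhere. Note that the loop reads ahead without limit, so with large inputs a bounded queue or a semaphore would be needed to cap memory use.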

