Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 5990623
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 22, 20262026-05-22T23:16:14+00:00 2026-05-22T23:16:14+00:00

I’ve created a program that parses data from a file and imports it into

  • 0

I’ve created a program that parses data from a file and imports it into a relational postgresql database.
The program has been running for 2 weeks and looks like it has a few more days left. It is averaging ~150 imports a second. How can I find the limiting factor and make it go faster?
The CPU for my program does not go above 10%, The Memory does not go above 7%.
The Postgresql database CPU does not go above 10%, and 25% Memory.

I’m guessing that the limiting factor is the hard-disk write speed, but how can I verify, and if the case; improve it? (short of buying a faster hard drive)

This is the output of “iostat -d -x”:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.59     3.23    0.61    1.55    23.15    38.37    28.50     0.01    5.76   1.04   0.22
sdb               0.02   308.37   21.72  214.53   706.41  4183.68    20.70     0.56    2.38   2.24  52.89

As you can likely guess, the database is on sdb.

EDIT: The file I am parsing is ~7GB.
For most (but not all) of the data in the file I go line by line, here is an example”

  1. Return the ID of partA in tableA.
    • If ID does not exist insert partA into tableA returning ID
  2. Return the ID of partB in tableB.
    • If ID does not exist insert partB into tableB returning ID
  3. Return the ID of the many-to-many relationship of partA and partB.
    • If the ID of the relationship does not exist create it.
    • else update the relationship (with a date id)
  4. move onto the next line.

To save many queries, I save the IDs of inserted PartA and PartB items in memory to reduce lookups.

Here is a copy of my postgresql.conf file: http://pastebin.com/y9Ld2EGz The only things I changed where the default data directory, and the memory limits.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-22T23:16:15+00:00Added an answer on May 22, 2026 at 11:16 pm

    You should have killed this process many, many days ago and asked not how to find the limiting factor of the app but rather:

    Is there a faster way to import data
    into pgSQL?

    Take a look at this question and then the pgSQL COPY documentation. It’s possible that the import process you’re running could be achieved in hours rather than weeks with the proper tools.

    By the way, regardless of what RDBMS you’re using, programmatic insertions of data are never as performant as native tools provided by the RDBMS’s vendor to handle bulk operations such as these. For example: SQL Server has bcp, DTS/SSIS and a few other options for bulk data import/export. Oracle has its own etc.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I'm parsing an RSS feed that has an ’ in it. SimpleXML turns this
link Im having trouble converting the html entites into html characters, (&# 8217;) i
That's pretty much it. I'm using Nokogiri to scrape a web page what has
Basically, what I'm trying to create is a page of div tags, each has
Does anyone know how can I replace this 2 symbol below from the string
this is what i have right now Drawing an RSS feed into the php,
I'm trying to decode HTML entries from here NYTimes.com and I cannot figure out
I have just tried to save a simple *.rtf file with some websites and
I have a bunch of posts stored in text files formatted in yaml/textile (from
I have some data like this: 1 2 3 4 5 9 2 6

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.