The Archive Base

Editorial Team
Asked: June 2, 2026

I have a text file ~6GB which I need to parse and later persist.

I have a ~6 GB text file that I need to parse and then persist. By "parsing" I mean reading a line from the file (usually about 2000 characters), creating a Car object from that line, and later persisting it.

I'm using a producer-consumer pattern to parse and persist, and I wonder whether it makes any difference to performance if I persist one object at a time or 1000 (or any other number) in one commit.

At the moment it takes more than 2 hours to persist everything (3 million lines), which seems like too long to me (though I may be wrong).

Currently I’m doing this:

public void persistCar(Car car) throws Exception
{
    try
    {
        carDAO.beginTransaction();  //get hibernate session...

        //do all save here.

        carDAO.commitTransaction(); // commit the session

    }catch(Exception e)
    {
        carDAO.rollback();
        e.printStackTrace(); 
    }
    finally
    {
        carDAO.close();
    }
}

Before I make any design changes, I would like to know whether the following design is better (or not) and, if so, what cars.size() should be. Also, is opening and closing a session considered expensive?

public void persistCars(List<Car> cars) throws Exception
{
    try
    {
        carDAO.beginTransaction();  //get hibernate session...
        for (Car car : cars)
        {
            //do all save here.
        }

        carDAO.commitTransaction(); // commit the session

    }catch(Exception e)
    {
        carDAO.rollback();
        e.printStackTrace(); 
    }
    finally
    {
        carDAO.close();
    }
}
1 Answer

Editorial Team added an answer on June 2, 2026 at 7:43 am

    Traditionally, Hibernate has not handled bulk inserts very well, but there are ways to optimize it to some degree.

    Take this example from the API docs:

    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();
    
    for ( int i=0; i<100000; i++ ) {
        Customer customer = new Customer(.....);
        session.save(customer);
        if ( i % 20 == 0 ) { //20, same as the JDBC batch size
            //flush a batch of inserts and release memory:
            session.flush();
            session.clear();
        }
    }
    
    tx.commit();
    session.close();
    

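    In the example above, the session is flushed and cleared after every 20 inserts. The flush sends the pending inserts to the database, and the clear keeps the session's first-level cache from growing without bound, which makes the operation somewhat faster and limits memory use. For the inserts to go out as JDBC batches, hibernate.jdbc.batch_size should be set to the same value (20 here).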
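To connect this to the producer-consumer setup in the question, the consumer side can collect parsed objects and hand them over in fixed-size chunks, so that each chunk is persisted in one transaction instead of one transaction per object. Below is a minimal sketch in plain Java with no Hibernate dependency. `BatchingBuffer` and the sink callback are hypothetical names introduced for illustration, and the sink is assumed to wrap something like the question's persistCars(cars).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Collects items and passes them to a sink in chunks of at most batchSize. */
final class BatchingBuffer<T> {
    private final int batchSize;
    // Hypothetical sink, e.g. cars -> persistCars(cars); not part of the original post.
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchingBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Adds one item and flushes automatically once a full chunk has accumulated. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Hands any pending items to the sink; call once more after the last item. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Pass a copy so the sink may keep the list after the buffer is cleared.
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

The consumer thread would call add(car) for each parsed object and flush() once at the end of the file. The best chunk size depends on the database and the mapping, so it is worth measuring a few values rather than assuming 1000 is right.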

    Here is an interesting article discussing the same topic.

    We have successfully implemented an alternative approach to bulk inserts using stored procedures. In this case, you pass the parameters to the stored procedure (SP) as a "|"-separated list and write the insert scripts inside the SP. The code may look a bit complex, but it is very effective.
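The Java side of the stored-procedure approach could look roughly like the sketch below, which uses only standard JDBC. The procedure name `insert_cars`, its single string parameter, and the class and method names are assumptions made for illustration. The answer does not show the actual procedure, and the splitting and inserting logic inside it is database-specific and not covered here.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;

final class PipeDelimitedInsert {
    /** Joins values with '|' and rejects values that already contain the delimiter. */
    static String toPipeList(List<String> values) {
        for (String value : values) {
            if (value.indexOf('|') >= 0) {
                throw new IllegalArgumentException("value contains the delimiter: " + value);
            }
        }
        return String.join("|", values);
    }

    /** Calls a hypothetical procedure insert_cars(list) with the joined values. */
    static void insertViaProcedure(Connection connection, List<String> values) throws SQLException {
        // "{call ...}" is the standard JDBC escape syntax for stored procedures.
        try (CallableStatement call = connection.prepareCall("{call insert_cars(?)}")) {
            call.setString(1, toPipeList(values));
            call.execute();
        }
    }
}
```

Rejecting values that contain "|" is one simple way to keep the list unambiguous. An escaping scheme would also work, provided the procedure applies the same rule when it splits the string.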
