Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 9001525
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 16, 20262026-06-16T00:26:54+00:00 2026-06-16T00:26:54+00:00

On my setup, PostgreSQL 9.2.2 seems to error out when trying to load a

  • 0

On my setup, PostgreSQL 9.2.2 seems to error out when trying to load a large csv file into a table.

The size of the csv file is ~9GB

Here’s the SQL statement I’m using to do the bulk load:

copy chunksBase (chunkId, Id, chunk, chunkType) from path-to-csv.csv' delimiters ',' csv

Here’s the error I get after a few minutes:

pg.ProgrammingError: ERROR:  out of memory
DETAIL:  Cannot enlarge string buffer containing 1073723635 bytes by 65536 more bytes.
CONTEXT:  COPY chunksbase, line 47680536

I think that the buffer can’t allocate more than exactly 1GB, which makes me think that this could be a postgresql.conf issue.

Here’s the uncommented lines in postgresql.conf:

bash-3.2# cat postgresql.conf | perl -pe 's/^[ \t]*//' | grep -v '^#' | sed '/^$/d'
log_timezone = 'US/Central'
datestyle = 'iso, mdy'
timezone = 'US/Central'
lc_messages = 'en_US.UTF-8'         # locale for system error message
lc_monetary = 'en_US.UTF-8'         # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'          # locale for number formatting
lc_time = 'en_US.UTF-8'             # locale for time formatting
default_text_search_config = 'pg_catalog.english'
default_statistics_target = 50 # pgtune wizard 2012-12-02
maintenance_work_mem = 768MB # pgtune wizard 2012-12-02
constraint_exclusion = on # pgtune wizard 2012-12-02
checkpoint_completion_target = 0.9 # pgtune wizard 2012-12-02
effective_cache_size = 9GB # pgtune wizard 2012-12-02
work_mem = 72MB # pgtune wizard 2012-12-02
wal_buffers = 8MB # pgtune wizard 2012-12-02
checkpoint_segments = 16 # pgtune wizard 2012-12-02
shared_buffers = 3GB # pgtune wizard 2012-12-02
max_connections = 80 # pgtune wizard 2012-12-02
bash-3.2# 

Nothing that explicitly sets a buffer to 1GB.

What’s going on here? Even if the solution is to increase a buffer in postgresql.conf, why is postgres seeming to try and bulk load an entire csv file into ram on the single copy call? One would think that loading large csv files is a common task; I can’t be the first person to come across this problem; so I would figure that postgres would have handled chunking the bulk load so that the buffer limit was never reached in the first place.

As a workaround, I’m splitting the csv into smaller files, and then calling copy for each file. This seems to be working fine. But it’s not a particularly satisfying solution, because now I have to maintain split versions of each large csv that I want to load into postgres. There has to be a more proper way to bulk load a large csv file into postgres.

EDIT1: I am in the process of making sure that the csv file is not malformed in any way. I’m doing this by trying to load all split csv files into postgres. If all can be loaded, then this indicates that the issue here is not likely due to the csv file being malformed. I’ve already found a few issues. Not sure yet if these issues are causing the string buffer error when trying to load the large csv.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-16T00:26:55+00:00Added an answer on June 16, 2026 at 12:26 am

    It turned out to be a malformed csv file.

    I split the large csv into smaller chunks (each with 1 million rows) and started loading each one into postgres.

    I started getting more informative errors:

    pg.ProgrammingError: ERROR:  invalid byte sequence for encoding "UTF8": 0x00
    CONTEXT:  COPY chunksbase, line 15320779
    
    pg.ProgrammingError: ERROR:  invalid byte sequence for encoding "UTF8": 0xe9 0xae 0x22
    CONTEXT:  COPY chunksbase, line 369513
    
    pg.ProgrammingError: ERROR:  invalid byte sequence for encoding "UTF8": 0xed 0xaf 0x80
    CONTEXT:  COPY chunksbase, line 16602
    

    There were a total of 5 rows with invalid utf8 byte sequences, out of a few hundred million. After removing those rows, the large 9GB csv loaded just fine.

    It would have been nice to get the invalid byte sequence errors when loading the large file initially. But at least they appeared once I started isolating the problem.

    Note that the line number mentioned in the error when loading the large file initially, had no relation with the encoding errors that were found when loading the smaller csv subset files. The initial line number was the point in the file where exactly 1GB of data took place, so it was related to the 1GB buffer allocation error. But, that error had nothing to do with the real problem…

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I'm trying to setup python fabric for postgresql, but can't figure out how to
I am trying to setup PostgreSQL and allow only certain Windows users to access
I am trying to make a self-managing partition table setup with Postgres. It all
I'm trying to setup a Postgres trigger in my Hibernate mapping file, but it
I'm trying to test a small PostgreSQL setup, so I cobbled together a quick
I'm trying to get PostgreSQL to work with psycopg2, and I have run into
I need to setup my PostgreSQL DB's text encoding to handle non-American English characters
setup.py from distutils.core import setup import py2exe setup(console=['program.py']) The error Traceback (most recent call
Setup iMac 10.7.4, Eclipse Indigo, Pydev, Python 2.7 I’m trying to install the Tweepy
Setup iMac OSX 10.7.4, Hi there, I am trying to install Homebrew on my

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.