Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 570305
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 13, 20262026-05-13T13:23:14+00:00 2026-05-13T13:23:14+00:00

I’m parsing text documents and inserting them into PostgreSQL DB. My code is written

  • 0

I’m parsing text documents and inserting them into PostgreSQL DB. My code is written in Java and I use JDBC for DB connectivity. I have encountered very strange error when adding data to DB – it seems that at unpredictable moment (different number of iteration of main loop) Postgres does not see rows just added to tables and fails to perform updates properly.

Maybe I am doing something wrong, so maybe there is a way to correct my code? Or is it serious error of PostgreSQL and I should post it at PostgreSQL home page (as bug report)?

Here are the details of what I’m doing and what is going wrong. I’ve simplified my code to isolate the error – simplified version doesn’t parse any text but I simulate it with generated words. Source files are included (java and sql) at the end of my question.

In simplified example of my problem I have one-threaded code, one JDBC Connection, 3 tables and few SQL statements involved (full Java sources are less than 90 lines).

Main loop works for “documents” – 20 words with subsequent doc_id (integer).

  1. Buffer table spb_word4obj is cleared for doc_id to be just inserted.
  2. Words are inserted into buffer table (spb_word4obj),
  3. Then unique new words are inserted into table spb_word
  4. And finally – document’s words are inserted into spb_obj_word – with word bodies replaced by word-ids from spb_word (references).

While iterating this loop for some time (e.g. 2,000 or 15,000 iterations – it is unpredictable) it fails with SQL error – cannot insert null word_id into spb_word. It gets more strange as repeating this very last iteration by hand gives no error. It seems like PostgreSQL have some issue with record insertion and statement execution speed – it loses some data or makes it is visible for subsequent statement after small delay.

Sequence of generated words is repeatable – every time code is run it generates the same sequence of words, but iteration number when code fails is different every time.

Here is my sql code to create tables:

create sequence spb_word_seq;

create table spb_word (
  id bigint not null primary key default nextval('spb_word_seq'),
  word varchar(410) not null unique
);

create sequence spb_obj_word_seq;

create table spb_obj_word (
  id int not null primary key default nextval('spb_obj_word_seq'),
  doc_id int not null,
  idx int not null,
  word_id bigint not null references spb_word (id),
  constraint spb_ak_obj_word unique (doc_id, word_id, idx)
);

create sequence spb_word4obj_seq;

create table spb_word4obj (
  id int not null primary key default nextval('spb_word4obj_seq'),
  doc_id int not null,
  idx int not null,
  word varchar(410) not null,
  word_id bigint null references spb_word (id),
  constraint spb_ak_word4obj unique (doc_id, word_id, idx),
  constraint spb_ak_word4obj2 unique (doc_id, word, idx)
);

And here goes Java code – it may just be executed (it has static main method).

package WildWezyrIsAstonished;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class StrangePostgresBehavior {

    private static final String letters = "abcdefghijklmnopqrstuvwxyząćęłńóśźż";
    private static final int llen = letters.length();
    private Connection conn;
    private Statement st;
    private int wordNum = 0;

    public void runMe() throws Exception {
        Class.forName("org.postgresql.Driver");

        conn = DriverManager.getConnection("jdbc:postgresql://localhost:5433/spb",
                "wwspb", "*****");
        conn.setAutoCommit(true);
        st = conn.createStatement();

        st.executeUpdate("truncate table spb_word4obj, spb_word, spb_obj_word");

        for (int j = 0; j < 50000; j++) {

            try {
                if (j % 100 == 0) {
                    System.out.println("j == " + j);
                }

                StringBuilder sb = new StringBuilder();

                for (int i = 0; i < 20; i++) {
                    sb.append("insert into spb_word4obj (word, idx, doc_id) values ('"
                            + getWord() + "'," + i + "," + j + ");\n");
                }
                st.executeUpdate("delete from spb_word4obj where doc_id = " + j);

                st.executeUpdate(sb.toString());

                st.executeUpdate("update spb_word4obj set word_id = w.id "
                        + "from spb_word w "
                        + "where w.word = spb_word4obj.word and doc_id = " + j);

                st.executeUpdate("insert into spb_word (word) "
                        + "select distinct word from spb_word4obj "
                        + "where word_id is null and doc_id = " + j);

                st.executeUpdate("update spb_word4obj set word_id = w.id "
                        + "from spb_word w "
                        + "where w.word = spb_word4obj.word and "
                        + "word_id is null and doc_id = " + j);

                st.executeUpdate("insert into spb_obj_word (word_id, idx, doc_id) "
                        + "select word_id, idx, doc_id from spb_word4obj "
                        + "where doc_id = " + j);
            } catch (Exception ex) {
                System.out.println("error for j == " + j);
                throw ex;
            }
        }
    }

    private String getWord() {
        int rn = 3 * (++wordNum + llen * llen * llen);
        rn = (rn + llen) / (rn % llen + 1);
        rn = rn % (rn / 2 + 10);

        StringBuilder sb = new StringBuilder();
        while (true) {
            char c = letters.charAt(rn % llen);
            sb.append(c);
            rn /= llen;
            if (rn == 0) {
                break;
            }
        }

        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        new StrangePostgresBehavior().runMe();
    }
}

So again: is it me doing something wrong (what exactly?) or is it serious flaw in PosgreSQL SQL Engine (than – is there a way for work-around)?

I’ve tested above on Windows Vista box with: Java 1.6 / PostgreSQL 8.3.3 and 8.4.2 / JDBC PostgreSQL drivers postgresql-8.2-505.jdbc3 and postgresql-8.4-701.jdbc4. All combinations lead to error described above. To be sure that it is not something with my machine I’ve tested in in similar environment on other machine.


UPDATE: I’ve turned on Postgres logging – as suggested by Depesz. Here are latest sql statements that were executed:

2010-01-18 16:18:51 CETLOG:  execute <unnamed>: delete from spb_word4obj where doc_id = 1453
2010-01-18 16:18:51 CETLOG:  execute <unnamed>: insert into spb_word4obj (word, idx, doc_id) values ('ouc',0,1453)
2010-01-18 16:18:51 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('rbjb',1,1453)
2010-01-18 16:18:51 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('pvr',2,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('gal',3,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('cai',4,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('żjg',5,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('egf',6,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('śne',7,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('ęęd',8,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('lnd',9,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('cbd',10,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('dąc',11,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('łrc',12,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('zmł',13,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('zxo',14,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('oćj',15,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('zlh',16,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('lńf',17,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('cóe',18,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: 
  insert into spb_word4obj (word, idx, doc_id) values ('uge',19,1453)
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: update spb_word4obj set word_id = w.id from spb_word w where w.word = spb_word4obj.word and doc_id = 1453
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: insert into spb_word (word) select distinct word from spb_word4obj where word_id is null and doc_id = 1453
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: update spb_word4obj set word_id = w.id from spb_word w where w.word = spb_word4obj.word and word_id is null and doc_id = 1453
2010-01-18 16:18:52 CETLOG:  execute <unnamed>: insert into spb_obj_word (word_id, idx, doc_id) select word_id, idx, doc_id from spb_word4obj where doc_id = 1453
2010-01-18 16:18:52 CETERROR:  null value in column "word_id" violates not-null constraint
2010-01-18 16:18:52 CETSTATEMENT:  insert into spb_obj_word (word_id, idx, doc_id) select word_id, idx, doc_id from spb_word4obj where doc_id = 1453

Now – code to check what is wrong in table spb_word4obj:

select * 
from spb_word4obj w4o left join spb_word w on w4o.word = w.word
where w4o.word_id is null

and it shows that two words: 'gal', 'zxo' caused the problem. But… they are found in spb_word table – just freshly inserted with sql statements from log (included above).

So – it is not issue with JDBC driver, it is rather Postgres itself?


UPDATE2: If I eliminate polish national chars (ąćęłńóśźż) from generated words, there is no error – code performs all 50,000 iterations. I’ve tested it few times. So, for this line:

    private static final String letters = "abcdefghijklmnopqrstuvwxyz";

there is no error, everything seems to be fine, but with this line (or with original line in full source above):

    private static final String letters = "ąćęłńóśźżjklmnopqrstuvwxyz";

I get error described above.


UPDATE3: I’ve just posted similar question without Java usage – fully ported to pure plpgsql, look here: Why this code fails in PostgreSQL and how to fix it (work-around)? Is it Postgres SQL engine flaw?. Now I know that it is not related to Java – it is Postgres alone problem.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-13T13:23:14+00:00Added an answer on May 13, 2026 at 1:23 pm

    My further investigation of the problem revealed that the problem is related to pure Postgres SQL, I developed pure plpgsql version which is one-to-one port of the code above. Restated question for pure plpgsql is here: Why this code fails in PostgreSQL and how to fix it (work-around)? Is it Postgres SQL engine flaw?.

    So – it is not Java/JDBC related problem.

    Furthermore, I’ve managed to simplify test code – now it uses one table. Simplified problem was posted on pgsql-bugs mailing list: http://archives.postgresql.org/pgsql-bugs/2010-01/msg00182.php. It is confirmed to occur on other machines (not only mine).

    Here is workaround: change database collation from polish to standard ‘C’. With ‘C’ collation there is no error. But without polish collation polish words are sorted incorrectly (with respect to polish national characters), so problem should be fixed in Postgres itself.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Ask A Question

Stats

  • Questions 360k
  • Answers 360k
  • Best Answers 0
  • User 1
  • Popular
  • Answers
  • Editorial Team

    How to approach applying for a job at a company ...

    • 7 Answers
  • Editorial Team

    How to handle personal stress caused by utterly incompetent and ...

    • 5 Answers
  • Editorial Team

    What is a programmer’s life like?

    • 5 Answers
  • Editorial Team
    Editorial Team added an answer No, you don't need to include them all. However, the… May 14, 2026 at 2:45 pm
  • Editorial Team
    Editorial Team added an answer WiX doesn't have an xmlLocator pattern so you will have… May 14, 2026 at 2:45 pm
  • Editorial Team
    Editorial Team added an answer If you wanted username to be required, I'd make it… May 14, 2026 at 2:45 pm

Trending Tags

analytics british company computer developers django employee employer english facebook french google interview javascript language life php programmer programs salary

Top Members

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.