The Archive Base


The Archive Base Latest Questions

Editorial Team
Asked: May 12, 2026 (2026-05-12T20:54:10+00:00)

I’m quite a newbie with PostgreSQL optimization and choosing whatever’s appropriate job for it


I’m quite a newbie with PostgreSQL optimization and with choosing which jobs it is appropriate for. So I want to know whether I’m trying to use PostgreSQL for an inappropriate job, or whether it is suitable and I just need to set everything up properly.

Anyway, I need a database that holds a lot of frequently changing data.

For example, imagine an ISP with a lot of clients, each having a session (PPP/VPN/whatever) with two self-describing, frequently updated properties, bytes_received and bytes_sent. There is a table in which each session is represented by a row with a unique ID:

CREATE TABLE sessions(
    id BIGSERIAL NOT NULL,
    username CHARACTER VARYING(32) NOT NULL,
    some_connection_data BYTEA NOT NULL,
    bytes_received BIGINT NOT NULL,
    bytes_sent BIGINT NOT NULL,
    CONSTRAINT sessions_pkey PRIMARY KEY (id)
);

And as accounting data flows in, this table receives a lot of UPDATEs like these:

-- There are *lots* of such queries!
UPDATE sessions SET bytes_received = bytes_received + 53554,
                    bytes_sent = bytes_sent + 30676
                WHERE id = 42;

When we receive a never-ending stream of updates (like 1-2 per second) for a table with a lot of sessions (like several thousand), PostgreSQL becomes very busy, probably because of MVCC. Is there any way to speed things up, or is Postgres simply not suitable for this task, so that I should move those counters to another storage such as memcachedb and use Postgres only for fairly static data? In that case I would lose the ability to run occasional queries on this data, for example to find the top 10 downloaders, which is not really good.

Unfortunately, the amount of data cannot be lowered much. The ISP accounting example is made up to simplify the explanation. The real problem is with another system, whose structure is harder to explain.

Thanks for suggestions!


1 Answer

  1. Editorial Team
    Added an answer on May 12, 2026 at 8:54 pm (2026-05-12T20:54:11+00:00)

    A database really isn’t the best tool for collecting lots of small updates, but as I don’t know your queryability and ACID requirements, I can’t really recommend something else. If it is acceptable to you, the application-side update aggregation suggested by zzzeek can lower the update load significantly.
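
    As a rough illustration of what application-side aggregation could look like on the database end, the sketch below assumes the application has already summed the deltas per session over a short window and sends them as one statement. The second session id and all byte counts are made-up values, and the aliases d, rx and tx are hypothetical.

    ```sql
    -- Sketch only: one batched statement instead of one UPDATE per accounting record.
    -- Each id must appear at most once in the VALUES list, because a target row
    -- that matches several source rows is updated with only one of them.
    UPDATE sessions AS s
    SET bytes_received = s.bytes_received + d.rx,
        bytes_sent     = s.bytes_sent + d.tx
    FROM (VALUES
            (42::bigint, 53554::bigint, 30676::bigint),
            (43::bigint,  1200::bigint,   880::bigint)   -- made-up second session
         ) AS d(id, rx, tx)
    WHERE s.id = d.id;
    ```

    The trade-off is that deltas held in application memory are lost if the application crashes before the batch is sent.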

    There is a similar approach that gives you durability and the ability to query fresher data, at some performance cost. Create a buffer table that collects the changes to the values that need updating, and insert the changes there. At regular intervals, in one transaction, rename the buffer table to something else and create a new table in its place. Then, in a second transaction, aggregate all the changes, apply the corresponding updates to the main table, and truncate the renamed buffer table. This way, if you need a consistent and fresh snapshot of any data, you can select from the main table and join in the changes from both the active and the renamed buffer tables.

    However, if neither is acceptable, you can also tune the database to cope better with heavy update loads.
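
    The buffer-table idea might be sketched roughly as follows. The table names sessions_delta and sessions_delta_old are assumptions for illustration, and this variant drops the renamed table instead of truncating it so that the next rotation can reuse the name.

    ```sql
    -- Assumed buffer table (hypothetical name); append-only, no indexes.
    CREATE TABLE sessions_delta (
        session_id     BIGINT NOT NULL,
        bytes_received BIGINT NOT NULL,
        bytes_sent     BIGINT NOT NULL
    );

    -- Hot path: insert a delta instead of updating the sessions row.
    INSERT INTO sessions_delta (session_id, bytes_received, bytes_sent)
    VALUES (42, 53554, 30676);

    -- Periodic step 1: swap in an empty buffer.
    BEGIN;
    ALTER TABLE sessions_delta RENAME TO sessions_delta_old;
    CREATE TABLE sessions_delta (LIKE sessions_delta_old INCLUDING ALL);
    COMMIT;

    -- Periodic step 2: fold the collected deltas into the main table.
    BEGIN;
    UPDATE sessions AS s
    SET bytes_received = s.bytes_received + d.rx,
        bytes_sent     = s.bytes_sent + d.tx
    FROM (SELECT session_id,
                 sum(bytes_received)::bigint AS rx,
                 sum(bytes_sent)::bigint     AS tx
          FROM sessions_delta_old
          GROUP BY session_id) AS d
    WHERE s.id = d.session_id;
    DROP TABLE sessions_delta_old;
    COMMIT;

    -- Fresh top 10 downloaders: main table plus the active buffer.
    -- (While sessions_delta_old exists, its rows would have to be joined in too.)
    SELECT s.id, s.username,
           s.bytes_received + COALESCE(d.rx, 0) AS total_received
    FROM sessions AS s
    LEFT JOIN (SELECT session_id, sum(bytes_received)::bigint AS rx
               FROM sessions_delta
               GROUP BY session_id) AS d ON d.session_id = s.id
    ORDER BY total_received DESC
    LIMIT 10;
    ```

    Inserts into an unindexed table are cheap compared with updates, which is where the saving comes from; the cost is the extra join whenever fresh numbers are needed.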

    To optimize updates, make sure that PostgreSQL can use heap-only tuples (HOT) to store the updated versions of the rows. To do this, make sure that there are no indexes on the frequently updated columns, and lower the fillfactor from its default of 100%. You’ll need to figure out a suitable fillfactor on your own, as it depends heavily on the details of the workload and the machine it is running on. The fillfactor needs to be low enough that almost all of the updates fit on the same database page before autovacuum has a chance to clean up the old, no longer visible versions. You can tune the autovacuum settings to trade off between the density of the database and vacuum overhead. Also, take into account that any long transactions, including statistical queries, will hold onto tuples that have changed after the transaction started. Check the pg_stat_user_tables view to see what to tune, especially the relationship of n_tup_hot_upd to n_tup_upd and of n_live_tup to n_dead_tup.
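
    A minimal sketch of these knobs for the sessions table from the question follows. The fillfactor of 70 and the autovacuum numbers are placeholder assumptions, not recommendations; suitable values have to be found by measuring the actual workload.

    ```sql
    -- Leave free space on each page so updated row versions can stay on the
    -- same page (placeholder value). Only newly written pages are affected;
    -- existing pages keep their density until the table is rewritten.
    ALTER TABLE sessions SET (fillfactor = 70);

    -- Make autovacuum visit this table more eagerly (placeholder values).
    ALTER TABLE sessions SET (
        autovacuum_vacuum_scale_factor = 0.02,
        autovacuum_vacuum_threshold    = 100
    );

    -- Share of updates that were heap-only, plus live vs. dead tuples.
    SELECT relname,
           n_tup_upd,
           n_tup_hot_upd,
           round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_update_pct,
           n_live_tup,
           n_dead_tup
    FROM pg_stat_user_tables
    WHERE relname = 'sessions';
    ```

    A hot_update_pct well below 100 suggests that updates are spilling onto other pages, which points toward a lower fillfactor or more aggressive vacuuming.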

    Heavy updating will also create a heavy write-ahead log (WAL) load. Tuning the WAL behavior (see the docs for the settings) will help lower that. In particular, a higher checkpoint_segments value (replaced by max_wal_size in PostgreSQL 9.5 and later) and a higher checkpoint_timeout can lower your IO load significantly by allowing more updates to happen in memory. Compare checkpoints_timed with checkpoints_req in pg_stat_bgwriter to see how many checkpoints are triggered by the timeout and how many are requested because the WAL size limit was reached. Raising shared_buffers so that the working set fits in memory will also help. Compare buffers_checkpoint with buffers_clean + buffers_backend to see how many buffers were written to satisfy checkpoint requirements and how many were written simply because memory ran out.
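
    The counters mentioned above can be read with a query along these lines. It uses the column names given in this answer, which match pg_stat_bgwriter up to PostgreSQL 16; from PostgreSQL 17 on, the checkpoint counters live in pg_stat_checkpointer and buffers_backend is no longer part of pg_stat_bgwriter, so treat this as a sketch for older versions.

    ```sql
    -- Timed vs. requested checkpoints, and who wrote the dirty buffers.
    SELECT checkpoints_timed,
           checkpoints_req,
           buffers_checkpoint,
           buffers_clean,
           buffers_backend
    FROM pg_stat_bgwriter;

    -- Current values of the settings discussed above
    -- (max_wal_size exists on 9.5 and later).
    SELECT name, setting, unit
    FROM pg_settings
    WHERE name IN ('checkpoint_timeout', 'max_wal_size', 'shared_buffers');
    ```

    If checkpoints_req grows much faster than checkpoints_timed, checkpoints are being forced by WAL volume, not by the timeout, which is a hint that the WAL size limit is too low for the update rate.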


© 2021 The Archive Base. All Rights Reserved