
The Archive Base



Editorial Team
Asked: May 14, 2026

I’m studying the best data structures to implement a simple open-source object temporal database


I’m studying the best data structures to implement a simple open-source object temporal database, and currently I’m very fond of using persistent red-black trees to do it.

My main reasons for using persistent data structures are, first of all, to minimize the use of locks, so the database can be as parallel as possible. It will also be easier to implement ACID transactions, and even to abstract the database to work in parallel on a cluster of some kind.
The great thing about this approach is that it makes it possible to implement temporal databases almost for free. And that is quite a nice thing to have, especially for the web and for data analysis (e.g. trends).
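For illustration (this is my own sketch, not anything from an existing system), the “temporal almost for free” idea amounts to keeping a list of (timestamp, root) pairs, where each root is an immutable snapshot; an as-of query is then just a binary search over commit times. The `TemporalIndex` name and its methods are hypothetical:

```python
import bisect

class TemporalIndex:
    """Hypothetical sketch: maps commit timestamps to immutable snapshots.
    Commits must arrive in increasing timestamp order; as_of() finds the
    snapshot that was current at a given time via binary search."""

    def __init__(self):
        self._times = []   # sorted commit timestamps
        self._roots = []   # the immutable root (snapshot) for each commit

    def commit(self, ts, root):
        self._times.append(ts)
        self._roots.append(root)

    def as_of(self, ts):
        # Index of the last commit with timestamp <= ts, or None if ts
        # predates every commit.
        i = bisect.bisect_right(self._times, ts) - 1
        return self._roots[i] if i >= 0 else None

ti = TemporalIndex()
ti.commit(10, {"a": 1})
ti.commit(20, {"a": 1, "b": 2})
print(ti.as_of(15))   # {'a': 1}  -- the snapshot in effect at t=15
```

The roots here are plain dicts for brevity; in the database they would be the immutable tree roots, so an as-of query costs one binary search plus an ordinary tree lookup.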

All of this is very cool, but I’m a little suspicious about the overall performance of a persistent data structure on disk. Even though some very fast disks are available today, and all writes can be done asynchronously so that a response is always immediate, I don’t want to build the whole application on a false premise, only to realize it isn’t really a good way to do it.

Here’s my line of thought:
– Since all writes are done asynchronously, and a persistent data structure never invalidates the previous – and currently valid – version, write time isn’t really a bottleneck.
– There is some literature on structures like this designed specifically for disk use. But it seems to me that those techniques add read overhead to achieve faster writes, and I think exactly the opposite trade-off is preferable. Also, many of those techniques do end up with multi-versioned trees, but they aren’t strictly immutable, which is crucial to justify the persistence overhead.
– I know there will still have to be some kind of locking when appending values to the database, and there should also be good garbage-collection logic if not all versions are to be kept (otherwise the file size will surely grow dramatically). A delta-compression scheme could also be considered.
– Of all the search tree structures, red-black trees seem closest to what I need, since they require the fewest rotations.
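To make the persistence mechanism concrete, here is a minimal sketch of path copying, the technique behind persistent trees, using an unbalanced binary search tree for brevity (a real implementation would add red-black rebalancing on top). Each insert copies only the nodes on the root-to-leaf path, so every earlier root remains a valid, immutable snapshot. All names are illustrative:

```python
class Node:
    """An immutable tree node; updates build new nodes instead of mutating."""
    __slots__ = ("key", "value", "left", "right")

    def __init__(self, key, value, left=None, right=None):
        self.key, self.value = key, value
        self.left, self.right = left, right

def insert(root, key, value):
    """Return a NEW root; the old root and everything under it is untouched.
    Only the O(depth) nodes on the search path are copied."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value, insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, insert(root.right, key, value))
    return Node(key, value, root.left, root.right)  # overwrite, in the new version only

def lookup(root, key):
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None

v1 = insert(None, "a", 1)
v2 = insert(v1, "b", 2)
v3 = insert(v2, "a", 99)   # overwrites "a", but only in version 3

print(lookup(v1, "b"))   # None -- v1 predates the insert of "b"
print(lookup(v2, "a"))   # 1    -- v2 still sees the old value
print(lookup(v3, "a"))   # 99
```

Note how v1 and v2 stay fully queryable after later writes; unchanged subtrees are shared between versions, which is also what keeps the space overhead to O(log n) new nodes per update.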

But there are some possible pitfalls along the way:
– Asynchronous writes could affect applications that need the data in real time. But I don’t think that is the case for web applications, most of the time. And when real-time data is needed, other solutions could be devised, such as a check-in/check-out system for specific data that has to be worked on in a more real-time manner.
– They could also lead to commit conflicts, though I fail to think of a good example of when that could happen. Then again, commit conflicts can occur in a normal RDBMS too, if two threads are working with the same data, right?
– Or maybe the overhead of an immutable structure like this grows so fast that everything is doomed to fail soon, and this is all a bad idea.
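The commit-conflict case can be made concrete with an optimistic, single-lock commit protocol (a sketch of one possible design, not a prescription): readers grab the current root with no lock at all, and a writer’s commit fails if the root has moved since the snapshot it built on. `OptimisticDB` is a hypothetical name:

```python
import threading

class OptimisticDB:
    """Hypothetical sketch of optimistic concurrency over immutable roots.
    Readers snapshot without locking; a writer's commit is rejected if
    another writer published a new root first (a commit conflict)."""

    def __init__(self, root=None):
        self._root = root
        self._commit_lock = threading.Lock()

    def snapshot(self):
        return self._root          # a plain reference read; no lock needed

    def commit(self, based_on, new_root):
        with self._commit_lock:    # the only lock, held very briefly
            if self._root is not based_on:
                return False       # conflict: someone committed in between
            self._root = new_root
            return True

db = OptimisticDB(root=("v0",))
s = db.snapshot()
print(db.commit(s, ("v1",)))   # True  -- first writer wins
print(db.commit(s, ("v2",)))   # False -- stale snapshot, commit conflict
```

This is exactly the scenario from the second pitfall: two writers start from the same snapshot, and the loser must retry against the new root, much like a serialization failure in an RDBMS.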

Any thoughts?

Thanks!

edit:
There seems to be a misunderstanding of what a persistent data structure is:
http://en.wikipedia.org/wiki/Persistent_data_structure



1 Answer

Editorial Team
Answered on May 14, 2026 at 6:24 pm

If you find you are getting bottlenecked on write time, or that your durability guarantee is meaningless without synchronous writes (hmm…), you should do what most other databases do: implement a Write-Ahead Log (WAL), or a redo-log.

Disks are actually pretty darn good at writing sequentially, or at least that’s what they’re best at. It’s random writes (such as those in a tree) that are terribly slow. Even flash drives, which beat the hell out of disks for random writes, are still significantly better at sequential writes. Actually, even most RAM is better at sequential writes because there are fewer control signals involved.

By using a write-ahead log, you don’t have to worry about:

• Torn writes (you wrote half a tree image before the cat ate your power supply)
• Loss of information (you didn’t actually get to persisting the tree, but Joe thinks you did)
• Huge performance hits from random, synchronous disk I/O.

© 2021 The Archive Base. All Rights Reserved
