The Archive Base

Editorial Team
Asked: May 30, 2026


I’m sorry in advance if this question is flawed. I’m pretty new to databases (I have set them up but have not used them much while learning to develop).

Background:
I have a process that generates a lot of test data. It is basically a hashtable with several hundred million records every day (at the end of the day I can delete those records). Generating the data takes too long on one machine, so I’m splitting the process up over several servers. Each server needs to look up a key in a database (currently a hashtable); if the key exists it does some work, and if it doesn’t exist it adds it. I think (so far) that I need a database that can handle a large number of writes in a consistent way (i.e. updates should be available instantly) and that can efficiently transfer this table over the network to other worker nodes. After the table is created, another job runs that is based on it, and I don’t think a single server serving a 10+ GB table to several servers is efficient, so I was thinking it needs to be distributed.

Problem/Question:
If I use a NoSQL solution like HBase (which I have a bit of experience setting up), will my application logic work? If I have 2 servers writing to a distributed database, is there any chance that server1 added an entry but when server2 looks it up it can’t find it because it hasn’t replicated through the cluster yet? Also, is there a better way to do what I’m trying to do? Would a single server with no distribution work better (I am also considering just using MySQL)? I was avoiding that because I wanted a solution where, if it was too slow, I could simply add more worker servers to write to the database, and I’m not sure whether my performance returns would diminish if I add 100 workers writing to a single server.

Any tips or suggestions would be great.

Thanks!

Update: I just realized that Facebook’s messaging infrastructure uses HBase. If it were not consistent, I would be getting crazy delays when messaging my friends. So how does HBase stay consistent (or is it really not consistent, and Facebook is so fast that it seems that way)?

1 Answer

  1. Editorial Team
    Added an answer on May 30, 2026 at 2:01 am

    If I have 2 servers writing to a distributed database, is there any chance that server1 added an entry but when server2 looks it up it can’t find it because it hasn’t replicated through the cluster yet?

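    HBase, in particular, guarantees strong consistency for operations on a single row: by default each row is served by exactly one region server at a time, so once a write has completed, the written data is visible to all clients. The write itself does not complete instantly, however, so a reader that looks up the key before the write has been acknowledged will not find it. Your logic must take that window into account.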
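The "look up, then add if missing" step from the question is a check-then-act sequence, so besides the visibility of completed writes it also matters that the check and the write happen as one atomic step. HBase offers an atomic single-row check-and-put for this purpose. The sketch below is not HBase client code: it is an in-memory stand-in (the class `RowStore`, its method `put_if_absent`, and the `server0`... worker names are all hypothetical) that only illustrates the semantics each worker would rely on.

```python
import threading


class RowStore:
    """In-memory stand-in for a strongly consistent key-value table (hypothetical)."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key, value):
        """Atomically insert key if missing; return True if this call inserted it."""
        with self._lock:
            if key in self._rows:
                return False
            self._rows[key] = value
            return True

    def get(self, key):
        with self._lock:
            return self._rows.get(key)


def worker(store, keys, name, claimed):
    # Every worker tries to claim every key; exactly one worker wins each key.
    for key in keys:
        if store.put_if_absent(key, name):
            claimed.append(key)


def run_demo(num_workers=4, num_keys=1000):
    store = RowStore()
    keys = [f"record-{i}" for i in range(num_keys)]
    # One private result list per worker, so the workers share only the store.
    results = [[] for _ in range(num_workers)]
    threads = [
        threading.Thread(target=worker, args=(store, keys, f"server{i}", results[i]))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store, results
```

If each worker instead did a plain `get` followed by a separate `put`, two workers could both see the key as missing and both add it, even on a fully consistent store.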

    Other NoSQL database engines, such as Cassandra, support what is called “eventual consistency”, which trades absolute consistency for write speed. This means that a piece of data written to the cluster will eventually be consistent across nodes, but it may take some time; typically this period is very short.
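To make the stale-read window concrete, here is a toy model of eventual consistency. It is a sketch under simplifying assumptions, not Cassandra's actual replication protocol: the classes `Replica` and `EventuallyConsistentStore` are hypothetical, a write is acknowledged after reaching one replica, and replication is an explicit step rather than a background process.

```python
class Replica:
    """One node holding its own copy of the data (hypothetical toy model)."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Toy model: a write lands on one replica and is copied to the others later."""

    def __init__(self, num_replicas=3):
        self.replicas = [Replica() for _ in range(num_replicas)]
        self.pending = []  # (key, value) pairs not yet copied to every replica

    def write(self, replica_index, key, value):
        # The write is acknowledged after reaching a single replica.
        self.replicas[replica_index].data[key] = value
        self.pending.append((key, value))

    def read(self, replica_index, key):
        # A read consults only one replica, which may be stale.
        return self.replicas[replica_index].data.get(key)

    def replicate(self):
        # Stand-in for background replication: bring every replica up to date.
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write(0, "record-1", "done")   # server1 writes through replica 0
before = store.read(1, "record-1")   # server2 reads through replica 1: not there yet
store.replicate()
after = store.read(1, "record-1")    # visible once replication has caught up
```

Here `before` is `None` and `after` is `"done"`, which is exactly the case the question asks about. Cassandra lets clients choose a consistency level per request to narrow or close this window, at the cost of slower operations.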

    My guess is that you would prefer the guaranteed consistency of HBase.

    Also, is there a better way to do what I’m trying to do?

    This depends on what your records are going to look like. Could you provide more information on the data you’ll be storing? If your data fits a document model (you typically need all of the fields when accessing the data for a given key), then you could look into document-based data stores such as MongoDB. MongoDB offers various levels of consistency (the default, rather conveniently, is to guarantee consistency like HBase).

    If you will often be looking for only a subset of the fields stored under each key, then HBase will help minimize the amount of data you send over the network by letting you specify which columns you want to receive from a scan or get.
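The effect of asking for only some columns can be sketched without a real cluster. The snippet below is an in-memory illustration, not the HBase client API: `get_row` and `payload_size` are hypothetical helpers, and the `family:qualifier` column names and values are made up for the example.

```python
def get_row(table, row_key, columns=None):
    """Return the row, restricted to `columns` when given (hypothetical helper)."""
    row = table.get(row_key, {})
    if columns is None:
        return dict(row)
    return {name: row[name] for name in columns if name in row}


def payload_size(row):
    # Rough size of what would cross the network: bytes of names plus values.
    return sum(len(name.encode("utf-8")) + len(value) for name, value in row.items())


# Made-up example row: two small columns and one large one.
table = {
    "record-1": {
        "info:status": b"done",
        "info:owner": b"server1",
        "data:blob": b"x" * 10_000,
    }
}

full = get_row(table, "record-1")
status_only = get_row(table, "record-1", columns=["info:status"])
```

A worker that only needs to know whether a record has been processed would request `status_only` and skip the large `data:blob` value entirely.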

    Would a single server… with no distribution work better (I was avoiding it because I wanted a solution where, if it was too slow, I could simply add more worker servers to write to a database; I’m not sure if my performance returns would diminish if I add 100 workers to write to a single server)?

    Distributed database engines will generally perform better under concurrent reads and writes. Due to the aforementioned properties, HBase is considered strong in read-heavy scenarios (writes aren’t visible until they have been fully committed), while Cassandra and other eventually consistent database engines are considered strong in write-heavy scenarios (though Cassandra’s latest release at the time of writing has seen significant performance gains in reading).

    A traditional database running on a single server will suffer as the read/write load increases, because it has to queue incoming connections and disk operations once they reach their respective rate limits. I believe HBase (or MongoDB, should you decide a document store could work for you) would best suit your need for consistency.
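One way to picture why adding servers relieves a single overloaded node is to count how many writes each server receives once keys are partitioned. The sketch below uses a simple hash-based scheme chosen only for illustration (`shard_for` and `writes_per_shard` are hypothetical helpers). HBase itself splits a table into regions by row-key range rather than by hash, so this is not a description of its internals.

```python
import hashlib
from collections import Counter


def shard_for(key, num_shards):
    """Map a key to a shard index with a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def writes_per_shard(keys, num_shards):
    """Count how many of the given keys each shard would have to store."""
    return Counter(shard_for(key, num_shards) for key in keys)


keys = [f"record-{i}" for i in range(100_000)]
one_server = writes_per_shard(keys, 1)    # every write hits the same machine
ten_servers = writes_per_shard(keys, 10)  # writes spread across ten machines
```

With one server, all 100,000 writes queue on the same machine; with ten, each machine handles roughly a tenth of them, which is why write throughput can grow as nodes are added.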
