I am currently researching what database to use for a project I am working

Question

Asked: June 9, 2026 (2026-06-09T17:45:40+00:00)

I am currently researching what database to use for a project I am working on. Hopefully you guys can give me some hints.

The project is an automated web crawler that checks websites as per a user’s request, scrapes data under certain circumstances, and creates log files of what was done.

Requirements:

Only a few tables with few columns; predefining columns is no problem
No overly complex associations between models
A huge number of date- and time-based queries
Due to logging, the database will grow rapidly and use up a lot of space
Should be able to scale over multiple servers
Fields contain mostly ids (int), strings (around 200-500 characters max), and unix timestamps
Two different types of servers will simultaneously read/write data directly to/from it:
- One (later more) Rails app that takes user input and displays results upon request
- One (later more) Node.js server that functions as the executing crawler/scraper. It will have enough load to run continuously and make dozens of database queries every second.

I assume it will be neither a graph database (no complex associations) nor a memory-based key/value store (too much data to hold in cache). I’m still on the fence about every other type of database I could find; each seems to have its merits.

So, any advice from the pros on how I should decide?

1 Answer

Editorial Team · Answer 1 · 2026-06-09T17:45:42+00:00

I would agree with Vladimir that you would want to consider a document-based database for this scenario. I am most familiar with MongoDB. My reasons for using it here are as follows:

Your ‘schema requirements’ of “only a few tables with few columns” fit well with the schema-light, NoSQL nature of MongoDB.
The same goes for “no overly complex associations between models”: you will want to decide whether you’d prefer nested documents or references via DBRef (I prefer the former).
Huge amount of time-based data (and other scaling requirements): MongoDB scales horizontally via sharding, that is, partitioning the data across multiple servers.
Read/write access: this is why I am recommending MongoDB over something like Hadoop. The interactive query requirement is best met by something other than a Hadoop-style store, as that type of storage is designed for batch processing rather than interactive queries.
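
To make the nested-document option and the time-based queries a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the field names (site_id, user_id, ts, action, page) and the helpers make_log_entry, time_range_filter and matches are made up and do not come from the question. The filter is written in MongoDB's query-operator style ($gte, $lt); matches is only a tiny in-memory stand-in so the example runs without a database server.

```python
def make_log_entry(site_id, user_id, url, action, message, ts):
    """Build one crawl-log document with the page details nested inside it."""
    return {
        "site_id": site_id,   # hypothetical field names throughout
        "user_id": user_id,
        "ts": ts,             # Unix timestamp in seconds, as in the question
        "action": action,
        "page": {"url": url, "message": message},  # embedded sub-document
    }


def time_range_filter(site_id, start, end):
    """MongoDB-style filter: entries for one site with start <= ts < end."""
    return {"site_id": site_id, "ts": {"$gte": start, "$lt": end}}


def matches(doc, flt):
    """In-memory stand-in for the server-side match; handles only
    equality, $gte and $lt, which is all this sketch needs."""
    for key, cond in flt.items():
        value = doc.get(key)
        if isinstance(cond, dict):
            for op, bound in cond.items():
                if op == "$gte" and not value >= bound:
                    return False
                if op == "$lt" and not value < bound:
                    return False
        elif value != cond:
            return False
    return True


entry = make_log_entry(7, 42, "https://example.com", "scrape", "ok", 1_700_000_000)
window = time_range_filter(7, 1_699_999_000, 1_700_001_000)
print(matches(entry, window))
```

Against a real server, a filter dictionary of this shape could be passed to a driver's find call, and a compound index on the site id and timestamp fields would likely be worth testing for range queries like this one. Whether the timestamp also makes a good shard key depends on the write pattern, so treat that as something to measure rather than assume.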
