I have an Oracle database of roughly 1.2 billion records with a web application on top of it that generates SQL queries and returns counts. Basically, you build the SQL queries graphically through an AJAX UI, and it performs pretty well.
The database is roughly 400 GB. I’ve been looking at Hadoop and thinking about using it instead of Oracle (my app would generate Hive queries instead), but it seems like overkill. Isn’t Hadoop targeted more at datasets in the tens of terabytes to petabytes? Is it suitable as a replacement for a relational database like Oracle for the task I’m doing?
Maybe. But it’s suitable for a wide variety of problems. It’s also suitable for very small datasets where the Hadoop “functional” style of programming helps.
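To give a feel for what that “functional” style looks like on a tiny dataset, here is a minimal sketch in plain Python. It is not Hadoop code; it only imitates the map and reduce steps in memory. The sample records and the names map_record and reduce_counts are made up for illustration.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample records: (region, product) pairs standing in for rows.
records = [
    ("east", "widget"),
    ("west", "gadget"),
    ("east", "gadget"),
    ("east", "widget"),
]

def map_record(record):
    """Map step: emit a (key, 1) pair for each record."""
    region, _product = record
    return (region, 1)

def reduce_counts(totals, pair):
    """Reduce step: fold one (key, count) pair into the running totals."""
    key, count = pair
    totals[key] += count
    return totals

# Map every record to a pair, then fold the pairs into per-key counts.
counts = reduce(reduce_counts, map(map_record, records), Counter())
```

The result is a per-region count, roughly what a GROUP BY with COUNT(*) would return. A real Hadoop job would distribute the two steps across machines, which this sketch does not attempt.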
SQL is not the perfect query language. It’s just widely adopted.
Without more detailed requirements, it’s almost impossible to tell. However, if you’re doing transactional work with lots of inserts, updates, and deletes, then a SQL RDBMS is probably necessary.
If you’re not doing complex transactions, and you’re mostly doing bulk loads and bulk queries, then the database is getting in your way. The file system will be faster, and often simpler.
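As a rough sketch of what a “bulk query” against the file system can look like, the snippet below streams a delimited file once and counts matching rows. It assumes the data has been exported to a CSV file with a header row; the function name count_matching and the file layout are hypothetical, and whether this beats your Oracle setup is something you would have to measure.

```python
import csv

def count_matching(path, column, value):
    """Stream a delimited file once and count rows where column == value."""
    with open(path, newline="") as handle:
        # DictReader takes column names from the header row (assumed present).
        reader = csv.DictReader(handle)
        return sum(1 for row in reader if row[column] == value)
```

For example, count_matching("export.csv", "region", "east") would play the role of SELECT COUNT(*) ... WHERE region = 'east', with the file read row by row rather than loaded into memory.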