For a business use case where we have to handle at least 2-3 terabytes of data per day, I have been evaluating Hadoop and Storm.
Storm looks impressive because of how efficiently it processes incoming data, but I am not sure whether it can process terabytes of data per day while still giving me real-time results.
Can anyone explain, please?
Thanks,
Gajendra
Storm was open-sourced by Twitter (it originated at BackType, which Twitter acquired), and they reportedly process more than 8 TB per day with it, so it should be enough for your case. As far as I know, Storm is the best streaming/real-time system for distributed computing. Hadoop is not well suited to real-time processing because of its job start-up times and because it does not handle streaming data natively.
In fact, both can handle the daily data volume you want, provided you have enough server capacity, storage, and so on.
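To put the daily volumes in perspective, it can help to convert them into a sustained rate. The following is only a back-of-the-envelope sketch in Python: it assumes decimal terabytes (10^12 bytes) and evenly spread traffic, and the per-worker throughput and peak factor passed to `workers_needed` are hypothetical numbers you would have to measure for your own topology, not figures from Storm or Hadoop.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def sustained_mb_per_sec(terabytes_per_day, bytes_per_tb=10**12):
    """Average ingest rate in MB/s for a given daily volume (decimal units)."""
    return terabytes_per_day * bytes_per_tb / SECONDS_PER_DAY / 10**6


def workers_needed(terabytes_per_day, per_worker_mb_per_sec, peak_factor=1.0):
    """Rough worker count for a daily volume.

    per_worker_mb_per_sec and peak_factor are assumptions (hypothetical
    values); measure them on your own hardware and workload.
    """
    peak_rate = sustained_mb_per_sec(terabytes_per_day) * peak_factor
    return math.ceil(peak_rate / per_worker_mb_per_sec)


if __name__ == "__main__":
    for tb in (2, 3, 8):
        print(f"{tb} TB/day is about {sustained_mb_per_sec(tb):.1f} MB/s sustained")
    # Hypothetical: each worker handles 5 MB/s, peaks are 3x the average.
    print(workers_needed(3, per_worker_mb_per_sec=5, peak_factor=3))
```

So 2-3 TB per day averages out to roughly 23-35 MB/s, which is why the answer mostly comes down to how many machines you have and how bursty the traffic is.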