I got a scenario where Data Stream B is dependent on Data Stream A. Whenever there is change in Data Stream A it is required re-process the Stream B. So a common process is required to identify the changes across datastreams and trigger the re-processing tasks.
Is there a good way to do this besides triggers.
I got a scenario where Data Stream B is dependent on Data Stream A.
Share
Your question is rather unclear and I think any answer depends very heavily on what your data looks like, how you load it, how you can identify changes, if you need to show multiple versions of one fact or dimension value to users etc.
Here is a short description of how we handle it, it may or may not help you:
I think you’re asking how to do step 5, but it depends on the data that changes and what your users expect to happen. For example, if one item in an order changes, we re-process the entire order to ensure that the order-level values are correct. If a customer address changes, we have to re-assign him to a new sales region.
There is no generic way to identify data changes and process them, because everyone’s data and requirements are different and everyone has a different toolset and different constraints and so on.
If you can make your question more specific then maybe you’ll get a better answer, e.g. if you already have a working solution based on triggers then why do you want to change? What problem are you having that is making you look for an alternative?