I’m thinking about the problem in the question title: if I have to query for an aggregate in a distributed architecture where the distributed event store may still be waiting for the latest events to be replicated, how can I know that the aggregate I’m reading via the read model is not about to be replaced by an updated one on another server in the network?
I have an HTTP server that receives events to save in the store. The store does not exist yet, but I want to implement it soon.
The events concern a huge aggregate that takes 4 MB when serialized as JSON.
Another sub-question: what storage do you recommend for the snapshot?
EDIT
I don’t understand whether the question is badly written or whether I have selected the wrong tags…
The ability to know when the “last” event in the distributed store is processed depends on two things:
The CAP theorem is a good reference for the sort of problems you are going to have with both of those in a distributed data store; in general, unless you give up availability, you are not going to get the properties you need for what you want.
On the other hand, if you can define last in a meaningful way, you can still have what you want. For example: do your events expire after a while? If, for example, they expire after 12 hours, you know that you can always meaningfully define last as “the moment in time 12 hours ago”, because any unprocessed event older than that is obsolete…
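To make the expiry idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Event` fields, the `EVENT_TTL` constant (taken from the 12-hour example) and the helper names are hypothetical, not part of any particular event store. It also only sees events that have already reached the node running it, so it illustrates how the cutoff works but does not guarantee that nothing newer is still in flight.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: events expire after 12 hours, as in the example above.
EVENT_TTL = timedelta(hours=12)


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape, for illustration only.
    aggregate_id: str
    sequence: int
    created_at: datetime
    processed: bool


def cutoff(now: datetime, ttl: timedelta = EVENT_TTL) -> datetime:
    """The moment before which unprocessed events are considered obsolete."""
    return now - ttl


def pending_events(events, now: datetime, ttl: timedelta = EVENT_TTL):
    """Unprocessed events that are still inside the expiry window."""
    limit = cutoff(now, ttl)
    return [e for e in events if not e.processed and e.created_at >= limit]


def is_up_to_date(events, now: datetime, ttl: timedelta = EVENT_TTL) -> bool:
    """True when no locally known, non-expired event is waiting to be applied."""
    return not pending_events(events, now, ttl)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    events = [
        Event("order-1", 1, now - timedelta(hours=13), processed=False),  # expired
        Event("order-1", 2, now - timedelta(hours=1), processed=True),
    ]
    print(is_up_to_date(events, now))
```

Events older than the cutoff are ignored even when unprocessed, which is exactly the trade-off described above: you accept that they are obsolete in exchange for a definition of last that every node can compute on its own.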
To answer your sub-question, I strongly recommend a storage engine that you do not write yourself, because distributed data storage is an awesomely hard problem that many very smart people, working for companies that do nothing but solve problems in this space, are already solving for you.
Leverage their work instead.