Long story short, I have a Java program that reads input data and writes results to a database. A C++ program takes the data, processes it, and then needs to pass the results back to Java so that Java can write them to the database.
The Java program pulls its data from Hadoop, so once the Hadoop job kicks off, the Java program gets flooded with data, but the actual processing (done by the C++ program) cannot handle all the data at once. So I need a way to control the flow as well. To complicate the problem (but simplify my work), I do the Java part and my friend does the C++ part, and we are trying to keep our programs as independent as possible.
That’s the problem. I found Google Protocol Buffers, and they seem like a good way to pass data between the programs, but I’m unsure how the Java program saving data can trigger the C++ program to process it, and then, when the C++ program saves its results, how the Java program will be triggered to store them. (This describes one or a few records, but we plan to process billions of records.)
What is the best approach to this problem? Is there a simple way of doing this?
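The simplest approach may be to use a TCP socket connection. The Java program sends the data it wants processed, and the C++ program sends back the results. If the Java side waits for each reply before sending the next batch, this also gives you flow control: the C++ program is never handed more data than it can process.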
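As a rough sketch of the Java side, assuming each record is framed as a 4-byte length followed by the payload bytes (my own convention here, not something either program requires), it could look something like the code below. The class name `FramedSocketDemo`, the empty-frame "end of stream" marker, and the in-process worker thread that upper-cases each record are all hypothetical; the worker only stands in for the C++ program so the example runs on its own.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class FramedSocketDemo {

    // Write one record as a 4-byte big-endian length followed by the payload.
    public static void writeFrame(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
    }

    // Read one length-prefixed record; blocks until the whole record has arrived.
    public static byte[] readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;
    }

    // Hypothetical stand-in for the C++ worker: upper-cases each record it receives.
    public static void startWorker(ServerSocket server) {
        Thread worker = new Thread(() -> {
            try (Socket s = server.accept();
                 DataInputStream in = new DataInputStream(s.getInputStream());
                 DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                while (true) {
                    byte[] record = readFrame(in);
                    if (record.length == 0) {
                        break; // empty frame marks end of stream (assumed convention)
                    }
                    String text = new String(record, StandardCharsets.UTF_8);
                    writeFrame(out, text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
                }
            } catch (IOException e) {
                // Connection closed or failed; nothing more to do in this sketch.
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Send records one at a time, waiting for each result before sending the next.
    // Waiting for the reply is what keeps the worker from being flooded.
    public static String[] processAll(String[] records) throws IOException {
        String[] results = new String[records.length];
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            startWorker(server);
            try (Socket s = new Socket(loopback, server.getLocalPort());
                 DataOutputStream out = new DataOutputStream(s.getOutputStream());
                 DataInputStream in = new DataInputStream(s.getInputStream())) {
                for (int i = 0; i < records.length; i++) {
                    writeFrame(out, records[i].getBytes(StandardCharsets.UTF_8));
                    results[i] = new String(readFrame(in), StandardCharsets.UTF_8);
                }
                writeFrame(out, new byte[0]); // tell the worker we are done
            }
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        for (String result : processAll(new String[] {"alpha", "beta"})) {
            System.out.println(result);
        }
    }
}
```

In a real setup the payload would more likely be a serialized Protocol Buffers message (for example the bytes from `toByteArray()`) than plain text, and the C++ program would read the same 4-byte length and payload on its end of the socket. Sending small batches instead of single records would cut down on round trips while keeping the same flow control.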