I need to set up a DLL that receives data (text strings) from different processes and threads. The data should go into threads that feed it to a queue, where the strings from all threads are sorted together and then stored in a file.
I can set up a dll. I can set up threads. I can make my threads thread safe. I can put data in a file and file stream. But questions remain:
• How can I set up a presumably static set of data so that multiple threads and processes add text to the same single shared data set in the dll? Can I set up a file stream the same way? How?
• Can separate processes use a single dll in this way? I thought a separate process would not be able to see a different process’ static data found in a dll. Is it possible, or do I have to look at doing something more complicated?
EDIT:
Thank you all so much. While I may not have an exact solution, your comments have pointed me in a direction that is useful and cleared up some confusion that I once had and that is what I needed. I consider this question answered.
2) static vs process: it absolutely depends on what exactly you mean by “static data”.
For a DLL, “static data” is a set of bytes placed somewhere inside the module, guarded with readonly flags. Whoever loads the DLL sees those same bytes, because they are part of the DLL itself. Of course, that data is mostly immutable and determined at compile/build time. This is, for example, how “resources” are kept and made available in assemblies.
For C# code, “static” data (fields, properties and events) are just global variables encapsulated under handy names in the form of namespaces and classes. They are not completely global: .NET has a notion of AppDomain, similar to the JRE’s class loaders, which allows you to run separate .NET applications in the same process, and those applications do not overwrite each other’s memory, even if they run exactly the same code with the same static fields. What is more, you may mark a static field with the [ThreadStatic] attribute to make the field not “globally global” but just “global per thread”, and every thread in your app will then have its own separate “static” copy of that field. And so on.
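To make the “global per thread” point concrete, here is a minimal sketch. The class name `Counters` and its members are made up for illustration; only `[ThreadStatic]`, `Thread` and `Interlocked` are real .NET APIs. It assumes a plain console program.

```csharp
using System;
using System.Threading;

public static class Counters
{
    // Ordinary static field: one copy per AppDomain, shared by all threads.
    public static int Shared;

    // [ThreadStatic] field: every thread gets its own independent copy.
    [ThreadStatic]
    public static int PerThread;

    // Runs the same work on several threads and records what each one saw
    // in its own PerThread copy.
    public static int[] Run(int threadCount)
    {
        var seen = new int[threadCount];
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            int slot = i; // per-iteration copy so the lambda captures the right index
            threads[i] = new Thread(() =>
            {
                Interlocked.Increment(ref Shared); // one counter for all threads
                PerThread++;                       // touches this thread's copy only
                seen[slot] = PerThread;
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return seen;
    }

    public static void Main()
    {
        int[] seen = Run(4);
        Console.WriteLine("Shared = " + Shared);
        Console.WriteLine("PerThread as seen by each worker = " + string.Join(",", seen));
    }
}
```

Each worker increments the same `Shared` field, while `PerThread` starts from zero on every new thread. Neither field is visible to another process that loads the same DLL.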
If you are talking about processes, there is no way to “share and communicate via a DLL” in the form I suspect you have in mind. DLL sharing is all about sharing common code. Data memory is separate by the very definition of a “process”, and the data of course resides in the process’ memory.
Speaking at lower levels of the system, there actually IS some sharing at the level of the virtual memory subsystem. If a code module is shared, the system might notice that 99 processes use the same DLL file, and it might decide to load it only once and then map the same pages of that file into the corresponding pages of each process’ memory. That way, it is loaded once and used multiple times, and true sharing occurs. Mind, however, that it is the code that is loaded and shared, not the dynamically allocated memory. Lower-level languages were able to “exploit” that kind of sharing: they could lift the code’s readonly protection, write to the code memory, and thus have their data propagated automatically to all processes that shared the same pages, but this is currently considered evil 🙂
Putting memory mapping aside, all of that means that having a DLL does not help you with communication much.
1) For threads: it is possible, and it sounds like you already know how. For processes: you cannot. Period. See above.
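For the thread-only case, a rough sketch of one shared data set inside a class library could look like the following. `TextSink`, `Add`, `DrainSorted` and `Flush` are hypothetical names, and the ordinal sort order is an assumption on my part; `ConcurrentQueue<T>` and `File.AppendAllLines` are the real .NET pieces.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class TextSink
{
    // One queue per process (per AppDomain); safe to use from many threads.
    private static readonly ConcurrentQueue<string> Queue = new ConcurrentQueue<string>();

    // Called by any thread that has text to contribute.
    public static void Add(string text)
    {
        Queue.Enqueue(text);
    }

    // Empties the queue and returns everything collected so far, sorted.
    public static string[] DrainSorted()
    {
        var items = new List<string>();
        string item;
        while (Queue.TryDequeue(out item)) items.Add(item);
        items.Sort(StringComparer.Ordinal);
        return items.ToArray();
    }

    // Appends the sorted batch to a file; call this from a single thread.
    public static void Flush(string path)
    {
        File.AppendAllLines(path, DrainSorted());
    }

    public static void Main()
    {
        // Several worker threads add text concurrently.
        Parallel.For(0, 4, i => Add("line from worker " + i));
        foreach (string line in DrainSorted()) Console.WriteLine(line);
    }
}
```

This works across threads of one process only. A second process loading the same DLL gets its own, empty `Queue`.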
Now, to solve your problem: At the core of it lies that you want to communicate between processes. The topic is called in short “IPC” or “inter-process communication”. Classical ways of dealing with it are, for starters:
and so on. If you think a little about how to use these, it gets quite simple: you prepare one special process that gathers all the data from the others, for example over a pipe or a network/socket connection, and after that the process simply does the job in the typical way. This is your “broker” or “service” process. It is rather hard to avoid having such a process, as you want the data to be collected and uniformly sorted: something must arrange the ordering, and that something must have most of the data at hand to perform it. Once you have that in mind, you may notice that the service process does NOT have to be separate. One of your “working” processes (the ones that generate the data) may also handle the sorting job. All that is needed is to orchestrate it somehow so that there is one data sink and everyone else knows who the sink is. I’ll stop the story here.
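As a rough illustration of the “one sink” idea, here is a sketch using named pipes from `System.IO.Pipes`. The pipe name, the `Broker` class and its methods are hypothetical, and for brevity the senders run as tasks inside the same program; in a real setup each `Send` would be called from a different process. It also assumes the broker knows how many senders to expect, which a real service would not.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Pipes;
using System.Threading.Tasks;

public static class Broker
{
    // Hypothetical pipe name; every participant must agree on it.
    public const string PipeName = "text-sink-demo";

    // Sink side: accept clientCount connections one after another,
    // read every line each client sends, and return all lines sorted.
    public static List<string> Collect(int clientCount)
    {
        var lines = new List<string>();
        for (int i = 0; i < clientCount; i++)
        {
            using (var server = new NamedPipeServerStream(PipeName, PipeDirection.In))
            {
                server.WaitForConnection();
                using (var reader = new StreamReader(server))
                {
                    string line;
                    while ((line = reader.ReadLine()) != null) lines.Add(line);
                }
            }
        }
        lines.Sort(StringComparer.Ordinal);
        return lines;
    }

    // Producer side: connect to the sink on this machine and send some lines.
    public static void Send(params string[] lines)
    {
        using (var client = new NamedPipeClientStream(".", PipeName, PipeDirection.Out))
        {
            client.Connect(); // blocks until the sink is listening
            using (var writer = new StreamWriter(client))
            {
                foreach (string l in lines) writer.WriteLine(l);
            }
        }
    }

    public static void Main()
    {
        Task<List<string>> sink = Task.Run(() => Collect(2));
        Task a = Task.Run(() => Send("pear", "apple"));
        Task b = Task.Run(() => Send("mango"));
        Task.WaitAll(a, b);
        foreach (string line in sink.Result) Console.WriteLine(line);
    }
}
```

The sink handles one client at a time here, which keeps the sketch short. A real broker would accept connections concurrently and decide for itself when a batch is complete enough to sort and write out.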
If you start wondering why there should be only one sink, and why one process has to have complete knowledge of all the data to be sorted: it is actually not required. There are quite a few smart sorting algorithms used on multicore/multiprocess machines (or even distributed platforms) that sort the data in parts and then glue the parts together so that the whole is sorted almost immediately. They are a bit harder to understand than a simple “common global data sink service”, but once you understand them, you may find that writing such an algorithm on file-based storage is faster/simpler than writing IPC over sockets or pipes.
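One simple member of that family is “sort each part separately, then merge the sorted parts”. The sketch below is not any specific algorithm from the literature, just a plain k-way merge. `PartMerge` is a made-up name, and the parts are in-memory arrays here where in practice each could be a file written by one process.

```csharp
using System;
using System.Collections.Generic;

public static class PartMerge
{
    // Merges several lists that are each already sorted (ordinal order)
    // into one sorted list, by repeatedly taking the smallest head element.
    public static List<string> Merge(IReadOnlyList<IReadOnlyList<string>> parts)
    {
        var positions = new int[parts.Count]; // next unread index in each part
        var result = new List<string>();
        while (true)
        {
            int best = -1;
            for (int p = 0; p < parts.Count; p++)
            {
                if (positions[p] >= parts[p].Count) continue; // part exhausted
                if (best < 0 || string.CompareOrdinal(
                        parts[p][positions[p]], parts[best][positions[best]]) < 0)
                {
                    best = p;
                }
            }
            if (best < 0) return result; // every part is exhausted
            result.Add(parts[best][positions[best]]);
            positions[best]++;
        }
    }

    public static void Main()
    {
        // Each producer sorts only its own data...
        var parts = new IReadOnlyList<string>[]
        {
            new[] { "apple", "pear" },
            new[] { "banana", "mango", "zebra" },
            new[] { "cherry" }
        };
        // ...and a final pass glues the sorted parts together.
        foreach (string s in Merge(parts)) Console.WriteLine(s);
    }
}
```

The merge step only ever looks at the current head of each part, so it never needs all the data in memory at once when the parts are streamed from files.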
However, since you can leverage the C# libraries, I suppose that doing IPC via web services, .NET Remoting, or a common database (SQL Express? MySQL?) will be a good start for you. Leave the pipes, sockets, memory-mapped files and other such things for later, when you are comfortable with orchestrating many processes.
Choose one concrete communication mechanism and ask about it; it will be easier to find answers and to explain.