If you were to design a transparent SMTP proxy in C# (.NET 4) to meet the following initial requirements:
- Scales well
- Logs all traffic to a database
- Can be extended easily, say for virus scanning attachments
Considering these factors, broadly speaking, how would your design look? Would you create Listener, Sender and Logger concrete classes, or something more abstract? And would you use callbacks, threads or processes, and why?
This is a non-trivial application. Some ideas that should help:
SMTP Scalability
In general, scaling a network application means being able to scale out (more machines) rather than up (a bigger, more expensive machine). This means having multiple servers that can handle SMTP requests. Note that this will likely need support at the network level (routers that can distribute messages to an ‘SMTP farm’).
Yes, to make an SMTP server scale and perform, you’ll likely want to use multiple threads (likely from some sort of thread pool). Note that a multithreaded sockets implementation is not trivial.
In terms of processes, I think one process (likely a Windows Service) per SMTP server, running multiple threads, is a good way to go.
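As a rough illustration of the "one process, many threads" idea, here is a minimal sketch that accepts connections asynchronously and hands each one to the thread pool. The `ProxyListener` class and the `handleClient` callback are names made up for this example; the actual SMTP relaying would live inside that callback and is not shown. A thread per active connection is the simplest model, not necessarily the most scalable one.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Hypothetical sketch: accepts TCP connections and dispatches each to the thread pool.
public sealed class ProxyListener
{
    private readonly TcpListener _listener;
    private readonly Action<TcpClient> _handleClient; // assumed to contain the relay logic
    private volatile bool _running;

    public ProxyListener(IPAddress address, int port, Action<TcpClient> handleClient)
    {
        _listener = new TcpListener(address, port);
        _handleClient = handleClient;
    }

    // The port actually bound (useful when 0 was requested). Valid after Start().
    public int Port
    {
        get { return ((IPEndPoint)_listener.LocalEndpoint).Port; }
    }

    public void Start()
    {
        _running = true;
        _listener.Start();
        _listener.BeginAcceptTcpClient(OnAccept, null);
    }

    public void Stop()
    {
        _running = false;
        _listener.Stop();
    }

    private void OnAccept(IAsyncResult ar)
    {
        TcpClient client = null;
        try
        {
            client = _listener.EndAcceptTcpClient(ar);
        }
        catch (ObjectDisposedException) { } // listener was stopped
        catch (SocketException) { }         // this accept failed

        if (!_running)
        {
            if (client != null) client.Close();
            return;
        }

        // Re-arm the accept before doing any work so new connections are not held up.
        try { _listener.BeginAcceptTcpClient(OnAccept, null); }
        catch (InvalidOperationException) { } // stopped or disposed in the meantime
        catch (SocketException) { }

        if (client == null) return;

        ThreadPool.QueueUserWorkItem(delegate
        {
            try { _handleClient(client); }
            catch (Exception) { /* a real service would log this */ }
            finally { client.Close(); }
        });
    }
}
```

Hosting this inside a Windows Service would mostly be a matter of calling `Start()` from `OnStart` and `Stop()` from `OnStop`. For very high connection counts, fully asynchronous socket I/O would likely scale better than blocking a pool thread per connection.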
Database Scalability
Keep in mind that the database can be a scalability bottleneck as well. To design for large loads, you would want to be able to scale your data tier horizontally too. That means being able to write to more than one db server, which in turn means being able to report from a set of database servers (which is much more complicated than reporting from one).
SMTP Reliability
Is this a concern / requirement? If so, this is another reason for supporting a farm of servers instead of just one (well, if we have multiple servers for reliability we might call it a cluster). Note that each server would need a way of letting the rest of the cluster know that it has failed (through some sort of heartbeat mechanism perhaps).
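To make the heartbeat idea a bit more concrete, here is a minimal in-memory sketch. `HeartbeatMonitor` and its members are hypothetical names for this example, and how the heartbeats actually travel between machines (UDP, a shared table, etc.) is left out entirely. The current time is passed in so the logic is easy to test.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: tracks when each node was last heard from.
public sealed class HeartbeatMonitor
{
    private readonly TimeSpan _timeout;
    private readonly Dictionary<string, DateTime> _lastSeen = new Dictionary<string, DateTime>();
    private readonly object _sync = new object();

    public HeartbeatMonitor(TimeSpan timeout)
    {
        _timeout = timeout;
    }

    // Called whenever a heartbeat arrives from a node.
    public void RecordHeartbeat(string nodeName, DateTime utcNow)
    {
        lock (_sync) { _lastSeen[nodeName] = utcNow; }
    }

    // Returns the nodes whose last heartbeat is older than the timeout.
    public IList<string> GetFailedNodes(DateTime utcNow)
    {
        var failed = new List<string>();
        lock (_sync)
        {
            foreach (KeyValuePair<string, DateTime> entry in _lastSeen)
            {
                if (utcNow - entry.Value > _timeout) failed.Add(entry.Key);
            }
        }
        return failed;
    }
}
```

Whatever distributes traffic to the farm could poll `GetFailedNodes` and stop routing to the nodes it returns.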
Database Reliability
To make the database reliable, you would have to do some clustering as well. This is neither cheap nor trivial (but has been done a number of times with a number of database platforms).
Queuing
One way to handle surges in server load is to queue messages. This way, the server can keep passing messages through, but you’re not waiting for the chain of extensible modules to finish their processing. Note that this adds another layer of complexity and a point of failure to the system.
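A minimal in-process version of this could be built on `BlockingCollection<T>` from .NET 4, as sketched below. `MessageQueue`, its `process` callback, and the use of a plain string for the message are assumptions made for the example. Being in-memory, it loses anything still queued if the process dies, which is exactly the extra point of failure mentioned above; a durable queue would address that at the cost of more complexity.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical sketch: a bounded in-memory queue drained by one background worker.
public sealed class MessageQueue : IDisposable
{
    private readonly BlockingCollection<string> _pending;
    private readonly Task _worker;

    public MessageQueue(int capacity, Action<string> process)
    {
        _pending = new BlockingCollection<string>(capacity);
        _worker = Task.Factory.StartNew(() =>
        {
            // Blocks while the queue is empty; ends once CompleteAdding() has been
            // called and the remaining items have been drained.
            foreach (string message in _pending.GetConsumingEnumerable())
            {
                try { process(message); }
                catch (Exception) { /* a real system would record the failure or retry */ }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Returns false immediately if the queue is full, so the caller can
    // decide how to push back (for example with a temporary SMTP error).
    public bool TryEnqueue(string rawMessage)
    {
        return _pending.TryAdd(rawMessage);
    }

    public void Dispose()
    {
        _pending.CompleteAdding();
        _worker.Wait();
        _pending.Dispose();
    }
}
```

The connection-handling threads would call `TryEnqueue` and return right away, while the handler chain runs inside `process` on the worker.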
Extensibility
One way to approach adding functionality such as database logging and attachment scanning is to add a chain of “MessageInspectors” or “MessageHandlers”. You would probably want to allow these to be configured in a particular order (e.g. virus scan before logging, so you don’t log infected items).
Another aspect to consider is which plug-ins can block a message from passing through (such as a virus scanner) and which can execute after the message has passed (such as logging).
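One possible shape for such a chain is sketched below, with separate interfaces for handlers that can block a message and handlers that only run afterwards. All of the type and member names here (`MailMessageContext`, `IBlockingHandler`, `IPostDeliveryHandler`, `HandlerPipeline`) are invented for illustration, and the `forward` delegate stands in for whatever actually relays the message.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message wrapper passed along the chain.
public sealed class MailMessageContext
{
    public string Sender { get; set; }
    public string Recipient { get; set; }
    public byte[] RawData { get; set; }
}

// A handler that may stop the message (e.g. a virus scanner).
public interface IBlockingHandler
{
    bool Allow(MailMessageContext message); // false blocks the message
}

// A handler that runs after the decision has been made (e.g. a logger).
public interface IPostDeliveryHandler
{
    void OnProcessed(MailMessageContext message, bool delivered);
}

public sealed class HandlerPipeline
{
    private readonly IList<IBlockingHandler> _blocking;
    private readonly IList<IPostDeliveryHandler> _post;
    private readonly Action<MailMessageContext> _forward;

    // Handlers run in the order given, so the order can come from configuration.
    public HandlerPipeline(IList<IBlockingHandler> blocking,
                           IList<IPostDeliveryHandler> post,
                           Action<MailMessageContext> forward)
    {
        _blocking = blocking;
        _post = post;
        _forward = forward;
    }

    // Returns true if the message was forwarded.
    public bool Process(MailMessageContext message)
    {
        bool allowed = true;
        foreach (IBlockingHandler handler in _blocking)
        {
            if (!handler.Allow(message)) { allowed = false; break; }
        }

        if (allowed) _forward(message);

        foreach (IPostDeliveryHandler handler in _post)
        {
            handler.OnProcessed(message, allowed);
        }
        return allowed;
    }
}
```

A logging handler that should skip infected items can simply do nothing when `delivered` is false.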
In terms of adding the plug-in support, you could use something like MEF (Managed Extensibility Framework), which ships with .NET 4.
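A minimal MEF sketch might look like the following. The `IMessageInspector` contract, the `SizeLimitInspector` plug-in and the `InspectorHost` class are all hypothetical; the MEF pieces themselves (`Export`, `ImportMany`, `AssemblyCatalog`, `CompositionContainer`, `ComposeParts`) come from `System.ComponentModel.Composition`, which needs to be referenced by the project.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;
using System.Reflection;

// Hypothetical plug-in contract.
public interface IMessageInspector
{
    string Name { get; }
    bool Inspect(byte[] rawMessage); // false rejects the message
}

// Example plug-in: rejects messages above an arbitrary 10 MB limit.
[Export(typeof(IMessageInspector))]
public sealed class SizeLimitInspector : IMessageInspector
{
    public string Name { get { return "size-limit"; } }

    public bool Inspect(byte[] rawMessage)
    {
        return rawMessage.Length <= 10 * 1024 * 1024;
    }
}

// Host that asks MEF to supply every exported inspector it can find.
public sealed class InspectorHost : IDisposable
{
    private CompositionContainer _container;

    [ImportMany]
    public IEnumerable<IMessageInspector> Inspectors { get; set; }

    public void Compose()
    {
        // This sketch only searches the current assembly.
        var catalog = new AssemblyCatalog(Assembly.GetExecutingAssembly());
        _container = new CompositionContainer(catalog);
        _container.ComposeParts(this);
    }

    public void Dispose()
    {
        if (_container != null) _container.Dispose();
    }
}
```

Swapping the `AssemblyCatalog` for a `DirectoryCatalog` pointed at a plug-ins folder would let new inspectors be dropped in as separate DLLs. MEF does not promise any particular ordering of the imported parts, so the configured handler order would have to be applied separately, for example by name.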
Reinventing the Wheel
Putting all of this functionality into place would take a considerable amount of development time. It might be cheaper / faster / easier to just purchase a solution off the shelf that does all of this for you (this problem has already been solved a number of times).