I am building a system for my client. It’s a lot like http://www.getafreelancer.com.
There are 2 types of user: Service Provider and Service Buyer.
Service Buyers post projects.
Service Providers are notified of any new projects that fit their classifications.
Assume:
There are approximately 100 qualifications. Each Service Provider can choose 10-15 qualifications.
Approximately 100 projects are posted daily, and there are 1 million Service Providers.
The problem is that we have to send email notifications to the Service Providers whose chosen qualifications match the project category (for all 100 projects daily).
That would mean going user by user, checking their qualifications against the project category, and sending them an email. How can I send 20 * 1 million = 20 million emails daily?
(Currently there won’t be 1 million users, but the system must be built with future requirements in mind.)
Please provide some suggestions.
My question is: Will I need any special hardware to send 1 million emails?
From the way you pose the question, it’s a little hard to figure out what you want.
If you want to send email in near-real-time, life is fairly simple. Some buyer has a need, and as soon as they post it, you send out a bunch of notifications. This will be hard to scale, but the pain will come on the email delivery side. Simply generating the list of providers to inform is fairly trivial, assuming your database design and infrastructure are less than awful.
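To illustrate why generating the recipient list is the easy part, here is a minimal sketch in Python that assumes the provider data fits in an in-memory mapping. The names `build_index` and `providers_to_notify` and the sample data are hypothetical; in a real system the same lookup would more likely be an indexed join in the database.

```python
from collections import defaultdict


def build_index(provider_qualifications):
    """Map each qualification to the set of provider ids that chose it."""
    index = defaultdict(set)
    for provider_id, qualifications in provider_qualifications.items():
        for qualification in qualifications:
            index[qualification].add(provider_id)
    return index


def providers_to_notify(index, project_category):
    """Return the providers whose qualifications include the project category."""
    return index.get(project_category, set())


# Hypothetical sample data: provider id -> chosen qualifications.
providers = {
    1: {"php", "mysql"},
    2: {"design"},
    3: {"php", "design"},
}
index = build_index(providers)
recipients = providers_to_notify(index, "php")
```

With the index built once, each new project costs a single lookup rather than a pass over every provider.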
If that’s what you’re doing, you just want to make sure that you’ve got your email delivery platform set up in a smart way. Design your infrastructure so your core system can easily delegate batches of email to N slave servers that handle the mail-merge and give you enough SMTP throughput.
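One possible way to delegate batches, sketched in Python: split the recipient list into fixed-size batches and hand them out round-robin. The function names, the batch size, and the worker count are assumptions for illustration; the actual hand-off (a queue, an HTTP call, and so on) and the SMTP sending are left out.

```python
from itertools import islice


def batches(recipients, batch_size):
    """Yield successive lists of at most batch_size recipients."""
    iterator = iter(recipients)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def assign_batches(recipients, batch_size, worker_count):
    """Distribute batches round-robin across worker_count mail servers."""
    assignments = {worker: [] for worker in range(worker_count)}
    for i, batch in enumerate(batches(recipients, batch_size)):
        assignments[i % worker_count].append(batch)
    return assignments


# Hypothetical example: 10 recipients, batches of 3, 2 mail servers.
plan = assign_batches(range(10), batch_size=3, worker_count=2)
```

Adding throughput then means raising `worker_count`; the core system only decides who gets which batch.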
If you want to send email periodically, so that providers receive just one email with several jobs, you’ll want to think about things a little differently.
In this case, consider an architecture based around segmentation of your list of providers, and database replication to handle the read-load.
In this model, a buyer posts a job. Your main database gets updated, and some number of slaves get the data via replication. Each slave is responsible for keeping its providers up to date via email. Every N hours, each slave reads from its dedicated, read-only database and handles its segment of users.
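The periodic run on one slave might look roughly like the following Python sketch. It assumes the slave has already read its segment of providers and the jobs posted since its last run from its replica; `build_digests` and the data shapes are hypothetical.

```python
def build_digests(segment, jobs_since_last_run):
    """Group matching jobs per provider so each provider gets one email.

    segment: dict of provider id -> set of chosen qualifications
    jobs_since_last_run: list of (job id, category) pairs
    """
    digests = {}
    for provider_id, qualifications in segment.items():
        matching = [
            job_id
            for job_id, category in jobs_since_last_run
            if category in qualifications
        ]
        if matching:  # providers with no matching jobs get no email
            digests[provider_id] = matching
    return digests


# Hypothetical data for one segment.
segment = {1: {"php", "mysql"}, 2: {"design"}, 3: {"java"}}
jobs = [(101, "php"), (102, "design"), (103, "mysql")]
digests = build_digests(segment, jobs)
```

Each entry in `digests` becomes a single email listing several jobs, which is what keeps the daily volume closer to one message per provider than one per match.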
You’ll need to work out the details of re-segmenting your providers when you bring a new notification system online. For instance, when you add your third notification system, it should end up serving approximately 1/3 of the load from each of the two existing ones.
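One technique that has this property is rendezvous (highest-random-weight) hashing; it is offered here as a suggestion, not as the only way to re-segment. In the Python sketch below, the node names and the `owner` function are hypothetical. Each provider is assigned to the node with the highest hash score, so adding a node only moves the providers that the new node wins.

```python
import hashlib


def owner(provider_id, nodes):
    """Pick the notification node with the highest hash score for this provider."""
    def score(node):
        digest = hashlib.sha256(f"{node}:{provider_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(nodes, key=score)


# Hypothetical node names; compare assignments before and after adding a third.
provider_ids = range(3000)
before = {p: owner(p, ["notify-1", "notify-2"]) for p in provider_ids}
after = {p: owner(p, ["notify-1", "notify-2", "notify-3"]) for p in provider_ids}
moved = [p for p in provider_ids if before[p] != after[p]]
```

Every provider in `moved` now belongs to `notify-3`, and with a well-mixed hash roughly a third of all providers move, drawn about evenly from the two existing nodes.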