I’m about to start designing a new application.
The application will let users enter data and will provide data analysis (with reports as well). I know that’s not very specific, but the data processing will happen in post-processing, so it isn’t really relevant to the front-end.
I’d like to start on the right path so that it’s easier to scale when I need to handle more users.
I’m thinking about PostgreSQL to store the data, because I’ve already used it and I like it. A NoSQL store would also be a reasonable choice, since not all of the data needs to be relational, but I like the Postgres support and I feel better knowing there’s a big community out there to help me. MySQL (InnoDB) is also a good choice; to be honest, I have no real reason to choose it over PostgreSQL or vice versa (is sharding maybe easier with MySQL?).
I know several programming languages, but my strengths are Python, C/C++ and JavaScript.
I’m not sure whether I should choose a sync or an async approach for this task (I could scale out by running more sync application instances behind a load balancer).
I’ve already developed another large project that taught me a lot about concurrency, but there each choice was influenced by the skills of the rest of the team, and mostly by those of the sysadmin, so we used Python (Django) + uWSGI + nginx.
For this project (it’s totally different from the other one: that was an e-commerce site, this is a SaaS) I was also considering node.js; it would be a good opportunity to try it out in a serious project.
The heaviest data processing would be done by post-processes, so the front-end (the user website) would be mostly I/O-bound (+1 for using an async environment).
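To make the I/O argument concrete, here is a minimal sketch, assuming Python's asyncio rather than node.js. The handler name and the sleep that stands in for a database or network call are hypothetical, not part of any real design yet. It only shows that a single thread can have several requests waiting on I/O at the same time.

```python
import asyncio

in_flight = 0       # requests currently waiting on simulated I/O
peak_in_flight = 0  # highest number of overlapping requests observed


async def handle_request(request_id: int) -> str:
    """Hypothetical handler; the sleep stands in for a database or network call."""
    global in_flight, peak_in_flight
    in_flight += 1
    peak_in_flight = max(peak_in_flight, in_flight)
    await asyncio.sleep(0.01)  # control goes back to the event loop while waiting
    in_flight -= 1
    return f"response {request_id}"


async def serve(n_requests: int) -> list[str]:
    # gather() runs the handlers concurrently on one thread and keeps result order
    return await asyncio.gather(*(handle_request(i) for i in range(n_requests)))


responses = asyncio.run(serve(5))
```

A sync worker would sit blocked for the whole wait, which is why the sync route needs more processes behind the load balancer to reach the same concurrency.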
What would you suggest?
P.S. I must also keep in mind that, first of all, the project has to get started, so I can’t just keep thinking through every possible design; I should start writing code ASAP 🙂
My current thoughts are:
– start with something you know
– keep it as simple as possible
– track everything to find bottlenecks
– scale out
So it wouldn’t really matter whether I deploy sync or async, but as far as I know async performs much better for I/O-bound work, and anything that could help me get better results (ergo lower costs) is worth considering as well.
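As a rough sketch of the "track everything" point in the list above, assuming Python: the names tracked, timings, slowest and save_user_data are made up for illustration, and the in-memory dictionary is only a stand-in for whatever logging or metrics system ends up being used.

```python
import time
from collections import defaultdict
from functools import wraps

# function name -> list of elapsed times in seconds (in-memory only)
timings = defaultdict(list)


def tracked(func):
    """Record how long each call to func takes, keyed by function name."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too
            timings[func.__name__].append(time.perf_counter() - start)
    return wrapper


@tracked
def save_user_data(payload):  # hypothetical request handler
    time.sleep(0.01)  # stands in for a database write
    return len(payload)


def slowest(n=3):
    """Return up to n function names, ordered by total time spent in them."""
    totals = {name: sum(samples) for name, samples in timings.items()}
    return sorted(totals, key=totals.get, reverse=True)[:n]
```

Decorating the handlers from day one costs very little, and calling slowest() later gives a first hint about where the bottleneck is before deciding what to scale out.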
I’m curious to know what your experiences are (also with other technologies)…
I’m becoming paranoid about scalability and I fear it could lead to a wrong design (it’s also the first time I’m designing alone for a commercial purpose = FUD).
If you need more info, please let me know and I’ll try to give you an answer.
Thanks.
Here are some basic guidelines:
You don’t necessarily have to do this from the get-go, but having this ability will go a long way toward scaling your app when the time comes.
Also, remember that these approaches are not mutually exclusive. You should design your app with all of them in mind, but only implement each one when required.
Take a look at the book The Art of Scalability.
It was written by people who worked at eBay and PayPal.