I use MySQL and I need to insert a large amount of data. The data is streamed to my server as lists of 5k rows each. I need to handle more than 3k such requests, which means 3k requests * 5k rows = 15,000,000 rows.
What I did was create threads and insert using those threads, since the data arrives in packets of 5k rows in an async event. Each data response is generated in reply to a request I send.
What is the best possible way to do it, keeping this scenario in mind?
Should I use a thread pool for thread management or a simple multithreaded application? And will threads benefit insertion at all, given that I need to insert into a single table (InnoDB engine)?
You can cache incoming requests on the server. Keep the data buffered in memory until you have received N requests (N is a value you can fine-tune later). Once you reach N, flush the buffered data into MySQL with a bulk insert routine. One big insert is generally much faster than many small ones.
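As a rough illustration of what a "bulk insert routine" can look like, here is a minimal sketch in Python that turns a batch of rows into a single multi-row INSERT statement. The function name `build_bulk_insert` and the table and column names in the usage note are hypothetical. The `%s` placeholder style is the one used by common Python MySQL drivers; a .NET client would use its own parameter syntax.

```python
from typing import List, Sequence, Tuple


def build_bulk_insert(
    table: str, columns: Sequence[str], rows: Sequence[Sequence[object]]
) -> Tuple[str, List[object]]:
    """Build one parameterized multi-row INSERT for a batch of rows.

    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input. Values are passed as
    parameters and left for the driver to escape.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("every row must have one value per column")

    # One "(%s, %s, ...)" group per row, all in a single statement.
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join([row_placeholder] * len(rows)),
    )
    # Flatten the rows into one parameter list, in the same order.
    params = [value for row in rows for value in row]
    return sql, params
```

For example, `build_bulk_insert("ticks", ["symbol", "price"], [("A", 1.0), ("B", 2.0)])` yields one statement with two value groups and four parameters, which can then be passed to a driver's `cursor.execute(sql, params)`. Very large batches may need to be split into chunks so that a single statement stays under the server's `max_allowed_packet` limit.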
You can use the ConcurrentBag class to keep the data on the server. It is a thread-safe collection.

Additionally, you may need to expire the cache based on time. This covers the case where you receive some number of requests n < N and then the client stops sending data. You would want to flush the buffer anyway rather than wait forever for further requests to fill the cache.
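The buffering scheme described above (flush at N rows, or when the buffered data gets too old) could be sketched as follows. This is an illustrative Python version, not the .NET code itself: the class name `BatchBuffer`, its parameters, and the injectable `clock` are all assumptions made for the example, and a plain list guarded by a lock stands in for a thread-safe collection such as ConcurrentBag.

```python
import threading
import time
from typing import Callable, List, Optional, Sequence

Row = Sequence[object]


class BatchBuffer:
    """Collects rows and hands them to `flush` in batches.

    A batch is flushed when it reaches `max_rows`, or when
    `flush_if_stale` finds that the oldest buffered row has waited
    at least `max_age_seconds`.
    """

    def __init__(
        self,
        flush: Callable[[List[Row]], None],
        max_rows: int,
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_rows = max_rows
        self._max_age = max_age_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._rows: List[Row] = []
        self._first_added_at: Optional[float] = None

    def add(self, rows: Sequence[Row]) -> None:
        """Called from each incoming request; safe to call from many threads."""
        with self._lock:
            if not self._rows:
                self._first_added_at = self._clock()
            self._rows.extend(rows)
            full = len(self._rows) >= self._max_rows
            batch = self._take_locked() if full else None
        if batch:
            # Flush outside the lock so other threads can keep buffering.
            self._flush(batch)

    def flush_if_stale(self) -> None:
        """Call periodically (e.g. from a timer) to flush a partial batch."""
        with self._lock:
            stale = bool(self._rows) and (
                self._clock() - self._first_added_at >= self._max_age
            )
            batch = self._take_locked() if stale else None
        if batch:
            self._flush(batch)

    def _take_locked(self) -> List[Row]:
        # Caller must hold the lock. Swap out the list so the buffer is empty.
        batch, self._rows = self._rows, []
        self._first_added_at = None
        return batch
```

In use, each async data event would call `add(rows)`, a background timer would call `flush_if_stale()` every few seconds, and `flush` would be the function that performs the bulk insert. Both thresholds are tuning knobs; the right values depend on row size and how much data you can afford to lose if the process dies before a flush.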