I am developing a Job application which executes multiple parallel jobs. Every job will pull data from third party source and process. Minimum records are 100,000. So i am creating new table for each job (like Job123. 123 is jobId) and processing it. When job starts it will clear old records and get new records and process. Now the problem is I have 1000 jobs and the DB has 1000 tables. The DB size is drastically increased due to lots of tables.
My question is whether it is ok to create new table for each job. or have only one table called Job and have column jobId, then enter data and process it. Only problem is every job will have 100,000+ records. If we have only one table, whether DB performance will be affected?
Please let me know which approach is better.
Don’t create all those tables! Even though it might work, there’s a huge performance hit.
Having a big table is fine, that’s what databases are for. But…I suspect that you don’t need 100 million persistent records, do you? It looks like you only process one Job at a time, but it’s unclear.
Edit
The database will grow to the largest size needed, but the space from deleted records is reused. If you add 100k records and delete them, over and over, the database won’t keep growing. But even after the delete it will take up as much space as 100k records.